Spider silk protein encoding nucleic acids, polypeptides, antibodies and methods of use thereof

ABSTRACT

Spider silk protein encoding nucleic acids, polypeptides and antibodies immunologically specific therefor are disclosed. Methods of use thereof are also provided.

This application is a §371 application of PCT/US02/09663, filed Mar. 28, 2002, which in turn claims priority to U.S. Provisional Application 60/315,529, filed August 29, 2001, the entire disclosure of each of the above-identified applications being incorporated by reference herein.

Pursuant to 35 U.S.C. §202(c), it is acknowledged that the U.S. Government has certain rights in the invention described herein, which was made in part with funds from the National Science Foundation, Grant Number MCB-9806999.

FIELD OF THE INVENTION

This invention relates to the fields of molecular and cellular biology. Specifically, nucleic acids encoding spider silk polypeptides, spider silk polypeptides, spider silk polypeptide-specific antibodies, and methods of use thereof are provided.

BACKGROUND OF THE INVENTION

Several publications are referenced in this application by numerals in parentheses in order to more fully describe the state of the art to which this invention pertains. Full citations for these references are found at the end of the specification. The disclosure of each of these publications is incorporated by reference herein.

Spider silks comprise a model system for exploring the relationship between the amino acid composition of a protein, the structural properties that result from variations in the amino acid composition of a protein, and how such variations impact protein function. While silk production has evolved multiple times within arthropods, silk use is most highly developed in spiders. Spiders are unique in their lifelong ability to spin an array of different silk proteins (or fibroin proteins) and the degree to which they depend on this ability. There are over 34,000 described species of Araneae (1). Each species utilizes silk, and some ecribellate orb-weavers (Araneoidea) have a varied toolkit of task-specific silks with divergent mechanical properties (2). Araneoid major ampullate silk, the primary dragline, is extremely tough. Minor ampullate silk, used in web construction, has high tensile strength. An orb-web's capture spiral, in part composed of flagelliform silk, is elastic and can triple in length before breaking (3). Each of these fibers is composed of one or more proteins encoded by the spider silk fibroin gene family (4). Sequencing of araneoid fibroins has revealed that these fibroins are dominated by iterations of four simple amino acid motifs: poly-alanine (A_(n)), alternating glycine and alanine (GA), GGX (where X represents a small subset of amino acids), and GPG(X)_(n) (5).
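The four motif classes named above can be located in a one-letter amino acid sequence with simple pattern matching. The following is a minimal sketch only. The minimum run lengths, the residues permitted at position X, and the toy sequence are assumptions chosen for illustration and are not taken from the sequences of the invention.

```python
import re

# Patterns for the four motif classes named in the text. The minimum run
# lengths and the residues allowed at "X" are assumptions for illustration.
MOTIF_PATTERNS = {
    "A_n": re.compile(r"A{4,}"),        # poly-alanine run, assumed >= 4 residues
    "GA": re.compile(r"(?:GA){2,}"),    # alternating Gly-Ala, assumed >= 2 pairs
    "GGX": re.compile(r"GG[AQYSLR]"),   # assumed subset of residues for X
    "GPG(X)_n": re.compile(r"GPG"),     # counts the GPG core of each GPG(X)_n unit
}


def count_motifs(protein):
    """Count non-overlapping matches of each pattern in a one-letter sequence."""
    return {
        name: len(pattern.findall(protein))
        for name, pattern in MOTIF_PATTERNS.items()
    }


# Hypothetical toy sequence, not one of the sequences of the invention.
toy = "GGAGGQGGYAAAAAAGAGAGAGPGQQGPGGY"
print(count_motifs(toy))
```

Each pattern is counted independently, so a stretch such as GPGGY contributes to both the GPG(X)_(n) count and the GGX count.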

Spiders draw fibers from dissolved fibroin proteins that are stored in specialized sets of abdominal glands. Each type of silk is secreted and stored by a different abdominal gland until extruded by tiny spigots on the spinnerets. Spiders use fibroin proteins singly or in combination for a variety of different purposes, including draglines, retreats, egg sacs, and prey-catching snares. Given these specialized applications, individual silks appear to have evolved to possess mechanical properties (e.g., tensile strength and flexibility) that optimize their utility for particular applications.

Orb web spiders like Nephila are known to produce spider silk proteins derived from several types of silk synthetic glands; these proteins are designated according to their organ of origin. Spider silk proteins known to exist include: major ampullate spider proteins (MaSp), minor ampullate spider proteins (MiSp), and flagelliform (Flag), tubuliform, aggregate, aciniform, and pyriform spider silk proteins. Spider silk proteins derived from each organ are generally distinguishable from those derived from other synthetic organs by virtue of their physical and chemical properties, which render them well suited to different uses. Tubuliform silk, for example, is used in the outer layers of egg-sacs, whereas aciniform silk is involved in wrapping prey and pyriform silk is laid down as the attachment disk.

Most molecular and structural investigations of spider silks have focused on dragline silk, which has an extraordinarily high tensile strength (e.g., Xu & Lewis, Proc. Natl. Acad. Sci., USA 87, 7120-7124, 1990; Hinman & Lewis, J. Biol. Chem. 267, 19320-19324, 1992; Thiel et al., Biopolymers 34, 1089-1097, 1994; Simmons et al., Science 271, 84-87, 1996; Kümmerlen et al., Macromol. 29, 2920-2928, 1996; and Osaki, Nature 384, 419, 1996). Dragline silk, often referred to as major ampullate silk because it is produced by the major ampullate glands, has a high tensile strength (5×10⁹ Nm⁻²) similar to Kevlar (4×10⁹ Nm⁻²) (Gosline et al., Endeavour 10, 37-43, 1986; Stauffer et al., J. Arachnol. 22, 5-11, 1994). In addition to this exceptional strength, dragline silk also exhibits substantial (~35%) elasticity (Gosline et al., Endeavour 10, 37-43, 1986). Thus a structure/function analysis of dragline silk is revealing in terms of the features of a protein which confer strength and elasticity.

Silk strength is widely attributed to crystalline beta-sheet structures. Such protein domains are found in both lepidopteran silks (e.g., Bombyx mori, Mita et al., J. Mol. Evol. 38, 583-592, 1994) and spider silks (Xu & Lewis, Proc. Natl. Acad. Sci., USA 87, 7120-7124, 1990; Hinman & Lewis, J. Biol. Chem. 267, 19320-19324, 1992; Gosline et al., Endeavour 10, 37-43, 1986). In contrast, elasticity is generally thought to involve amorphous regions (Wainwright et al., Mechanical design in organisms, Princeton University Press, Princeton, 1982). More precise characterization of these amorphous components can be revealed by molecular sequence data.

Based on the protein sequences of major ampullate silk proteins, a beta-turn structure was suggested to be the likely mechanism of elasticity (Hinman & Lewis, J. Biol. Chem. 267, 19320-19324, 1992). Assessing this proposition, however, was problematic because dragline silk is a hybrid of at least two distinct proteins which impart both strength and moderate elasticity.

Nephila minor ampullate silk can be distinguished from Nephila major ampullate silk by both physical and chemical properties. On a basic level, the amino acid composition of solubilized minor ampullate silk differs from that of solubilized major ampullate silk. Like the major ampullate silk proteins (major spidroin 1, MaSp1; major spidroin 2, MaSp2), the proteins comprising minor ampullate silk (minor spidroin 1, MiSp1; minor spidroin 2, MiSp2) have a primary structure dominated by imperfect repetition of a short sequence of amino acids. Moreover, in contrast to the elasticity exhibited by major ampullate silk, minor ampullate silk yields without recoil. Minor ampullate silk will stretch by about 25% of its initial length before breaking, thereby exhibiting a tensile strength of nearly 100,000 pounds per square inch (psi). The minor ampullate silk proteins, therefore, exhibit comparatively lower tensile strength and elasticity relative to major ampullate silk proteins.
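Because the tensile strength of minor ampullate silk is stated in psi while that of dragline silk is stated in Nm⁻², a unit conversion makes the comparison explicit. The short sketch below uses the standard conversion factor for pounds per square inch. The function and variable names are illustrative only.

```python
PASCALS_PER_PSI = 6894.757  # 1 pound-force per square inch expressed in N/m^2


def psi_to_pascals(psi):
    """Convert a pressure or stress value from psi to N/m^2."""
    return psi * PASCALS_PER_PSI


# Approximate minor ampullate figure stated in the text (nearly 100,000 psi).
minor_ampullate = psi_to_pascals(100_000)
# Dragline (major ampullate) figure stated in the Background, in N/m^2.
dragline = 5e9

print(f"minor ampullate: {minor_ampullate:.2e} N/m^2")
print(f"fraction of dragline strength: {minor_ampullate / dragline:.2f}")
```

On these figures, 100,000 psi corresponds to roughly 6.9×10⁸ Nm⁻², which is consistent with the comparatively lower tensile strength noted for minor ampullate silk.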

The capture spiral, on the other hand, is formed from silk proteins derived from the flagelliform and aggregate silk glands. The capture spiral of an orb-web comprises a structure having significant ability to stretch, as would be anticipated for a structure that must capture and retain prey. The capture thread has a lower tensile strength (1×10⁹ Nm⁻²) but several times the elasticity (>200%) of dragline silk (Vollrath & Edmonds, Nature 340, 305-307, 1989; Kohler & Vollrath, J. Exp. Zool. 271, 1-17, 1995). The flagelliform silk comprises the core fiber of the spiral, while aggregate silk provides a non-fibrous, aqueous coating. Thus, while aggregate silk is an integral part of the elastic capture spiral, it is flagelliform silk that provides the ability to stretch.

SUMMARY OF THE INVENTION

In view of the unique properties of different silks produced by spiders, the identification of novel spider silk proteins and characterization of their chemical and physical properties provide useful new reagents having utility for a number of applications. Spider silk proteins are unique in that they possess properties which include, but are not limited to, high tensile strength and elasticity. Moreover, individual spider silk proteins have evolved to possess different combinations of properties that contribute to the physical balance between protein strength and elasticity.

Spider silk is composed of fibers formed from proteins. Naturally occurring spider silk fibers can be composites of two or more proteins. In general, spider silk proteins are found to have primary amino acid sequences that can be characterized as indirect repeats of a short consensus sequence. Variation in the consensus sequence is then responsible for the distinguishable properties of different silk proteins.

Silk fibers can be made from synthetic polypeptides having amino acid sequences substantially similar to a consensus repeat unit of a silk protein or from polypeptides expressed from nucleic acid sequences encoding a natural or engineered silk protein, or derivative thereof. Depending on the application for which a synthetic spider silk protein is intended, it may also be desirable to form fibers from a single spider silk protein or combinations of different spider silk proteins, the ratio of which can be modified accordingly.

According to one aspect of the invention, nucleic acid sequences encoding novel spider silk proteins are provided. Exemplary nucleic acid sequences of the invention have sequences comprising SEQ ID NOs: 1-28.

In a particular aspect of the invention, exemplary nucleic acid sequences encoding novel MaSp1-like spider silk proteins are provided. Exemplary nucleic acid sequences of this type have sequences comprising SEQ ID NOs: 1-7.

In another aspect of the invention, exemplary nucleic acid sequences encoding novel MaSp2-like spider silk proteins are provided. Exemplary nucleic acid sequences of this type have sequences comprising SEQ ID NOs: 8-16.

In another aspect of the invention, exemplary nucleic acid sequences encoding novel flagelliform (flag)-like spider silk proteins are provided. Exemplary nucleic acid sequences of this type have sequences comprising SEQ ID NOs: 17 and 18.

In another aspect of the invention, nucleic acid sequences encoding novel spider silk proteins are provided. Exemplary nucleic acid sequences of this type have sequences comprising SEQ ID NOs: 19 and 20.

In yet another aspect of the invention, nucleic acid sequences encoding novel spider silk proteins which comprise atypical repetitive motifs are provided. Exemplary nucleic acid sequences of this type have sequences comprising SEQ ID NOs: 21-27.

In a particular aspect of the invention, an isolated nucleic acid sequence which encodes a novel spider silk protein comprising atypical repetitive motifs is provided. An exemplary nucleic acid sequence of this type has a sequence comprising SEQ ID NO: 28.

In a preferred embodiment of the invention, the isolated nucleic acid molecules provided encode spider silk proteins. In a particularly preferred embodiment, spider silk proteins of the present invention have amino acid sequences comprising SEQ ID NOs: 29-56.

In a particular aspect of the invention, novel MaSp1-like spider silk proteins have amino acid sequences comprising SEQ ID NOs: 29-35.

In another aspect of the invention, novel MaSp2-like spider silk proteins have amino acid sequences comprising SEQ ID NOs: 36-44.

In another aspect of the invention, novel flag-like spider silk proteins have amino acid sequences comprising SEQ ID NOs: 45 and 46.

In another aspect of the invention, novel spider silk proteins have amino acid sequences comprising SEQ ID NOs: 47 and 48.

In yet another aspect of the invention, novel spider silk proteins comprising atypical repetitive motifs have amino acid sequences comprising SEQ ID NOs: 49-55.

In a particular aspect of the invention, a novel spider silk protein comprising atypical repetitive motifs is provided. An exemplary spider silk protein amino acid sequence of this type comprises SEQ ID NO: 56.

According to another aspect of the present invention, an isolated nucleic acid molecule is provided, which has a sequence selected from the group consisting of: (1) SEQ ID NOs: 1-28; (2) a sequence specifically hybridizing with preselected portions or all of an individual complementary strand of SEQ ID NOs: 1-28 comprising nucleic acids encoding amino acids of SEQ ID NOs: 29-56; (3) a sequence encoding preselected portions of SEQ ID NOs: 1-28; and (4) a sequence comprising nucleic acids encoding amino acids of a consensus sequence (SEQ ID NO: 57) which was derived from SEQ ID NO: 56.

Such partial sequences are useful as probes to identify and isolate homologues of spider silk protein genes of the invention. Additionally, isolated nucleic acid sequences encoding natural allelic variants of the nucleic acids of SEQ ID NOs: 1-28 are also contemplated to be within the scope of the present invention. The term natural allelic variants will be defined hereinbelow.

According to another aspect of the present invention, antibodies immunologically specific for the spider silk proteins described hereinabove are provided.

In yet another aspect of the invention, host cells comprising at least one of the spider silk protein encoding nucleic acids are provided. Such host cells include but are not limited to bacterial cells, fungal cells, insect cells, mammalian cells, and plant cells. Host cells overexpressing one or more of the spider silk protein encoding nucleic acids of the invention provide valuable reagents for many applications, including, but not limited to, production of silk fibers comprising at least one silk protein that can be incorporated into a material to modulate the structural properties of the material.

Naturally occurring spider silk proteins have an imperfectly repetitive structure. Imperfections in the repetition are likely to be a consequence of the process by which the silk protein genes evolved, rather than a requirement for fiber formation. Imperfections in repetition are thus not likely to affect properties of fibers formed following aggregation of protein molecules.

Accordingly, in another embodiment of the present invention nucleic acid sequences are provided which encode engineered spider silk proteins, each of which comprises a polypeptide having direct repeats of a unit amino acid sequence. Alternatively, nucleic acid sequences may include several different unit amino acid sequences to form a “copolymer” silk protein.
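The distinction between a direct-repeat protein and a “copolymer” protein can be illustrated at the level of the amino acid sequence. The sketch below is a simplified illustration only. The unit sequences and copy numbers are hypothetical and are not the repeat units of the invention.

```python
def direct_repeat(unit, copies):
    """Build a polypeptide sequence made of exact tandem copies of one unit."""
    return unit * copies


def copolymer(blocks):
    """Build a sequence from several (unit, copies) blocks joined in order."""
    return "".join(unit * copies for unit, copies in blocks)


# Hypothetical unit sequences chosen for illustration.
print(direct_repeat("GPGGY", 3))
print(copolymer([("GGA", 2), ("AAAAA", 1)]))
```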

In yet another embodiment of the present invention a spider silk protein expressed from a nucleic acid sequence is provided, wherein the nucleic acid sequence is obtained from cDNA, genomic DNA, synthetic DNA, or fragments of all of the above, derived from a spider ampullate gland.

In another embodiment of the present invention fibers made from silk protein obtained by expression of nucleic acid sequences encoding at least one spider silk protein are provided.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows an analysis of phylogenetic relationships of Araneae based on morphological evidence (1, 27). Previously published spider fibroin sequences are from the two genera marked by white circles. Including data presented herein, fibroin sequences have been characterized for the taxa in red. Circles at internal nodes mark fossil calibration points. Extinct taxa that calibrate these nodes, Macryphantes (16; black circle) and Rosamygale (13; gray circle), are indicated, and higher level taxa are to the right of brackets. Dolomedes, Plectreurys, and Euagrus are from the families Pisauridae, Plectreuridae, and Dipluridae, respectively.

FIG. 2 shows consensus ensemble repeat units for non-araneoid spider fibroins. Single letter symbols for amino acids are used, and GGX, GA, and A_(n) motifs are indicated in green, brown, and red, respectively. Plectreurys cDNA1 and Plectreurys cDNA2 were derived from the larger ampule-shaped glands of Plectreurys, and Plectreurys cDNA3 and Plectreurys cDNA4 were from the smaller ampullate glands of this spider.

FIG. 3 shows consensus ensemble repeat units for four araneoid fibroin orthologue groups. Single letter symbols for amino acids are used, and GGX, GA, A_(n), and GPG(X)_(n) motifs are indicated in green, brown, red, and blue, respectively. The “[spacer]” region of the MiSp fibroins is a serine-rich sequence that is 137 amino acids long in Nephila clavipes (Genbank #AF027735). Nep.c.=Nephila clavipes, Nep.m.=N. madagascariensis, Nep.s.=N. senegalensis, Tet.k.=Tetragnatha kauaiensis, Tet.v.=T. versicolor, Lat.g.=Latrodectus geometricus, Arg.t.=Argiope trifasciata, Arg.a.=A. aurantia, Ara.b.=Araneus bicentenarius, Ara.d.=A. diadematus, Gas.m.=Gasteracantha mammosa. §m=cDNA from major ampullate glands, §f=cDNA from flagelliform glands, †=PCR/genomic clone, *=previously published sequence. The previous designations for A. diadematus fibroins (4) are shown in parentheses (ADF1-4).

DETAILED DESCRIPTION OF THE INVENTION

The physical characteristics of spider silk proteins confer unparalleled mechanical properties to these fibroins and, thus, render spider silk proteins ideally suited to a variety of applications. Identification of novel spider silk proteins as described herein, therefore, provides useful tools for the generation of natural and synthetic spider silk proteins which can be woven into fibers to imbue fibers comprised of such proteins with unique properties.

In a preferred embodiment of the invention, nucleic acid sequences encoding novel spider silk proteins have sequences comprising SEQ ID NOs: 1-28.

In a particularly preferred embodiment, spider silk proteins of the present invention have amino acid sequences comprising SEQ ID NOs: 29-56.

In yet another preferred embodiment, a consensus sequence derived from SEQ ID NO: 56 has an amino acid sequence comprising SEQ ID NO: 57.

Other spider silk proteins have been previously identified; see, for example, U.S. Pat. Nos. 5,773,771; 5,989,894; and 5,728,810, the entire disclosures of which are incorporated herein by reference.

I. Definitions

The following definitions are provided to facilitate an understanding of the present invention:

With reference to nucleic acids used in the invention, the term “isolated nucleic acid” is sometimes employed. This term, when applied to DNA, refers to a DNA molecule that is separated from sequences with which it is immediately contiguous (in the 5′ and 3′ directions) in the naturally occurring genome of the organism from which it was derived. For example, the “isolated nucleic acid” may comprise a DNA molecule inserted into a vector, such as a plasmid or virus vector, or integrated into the genomic DNA of a procaryote or eucaryote. An “isolated nucleic acid molecule” may also comprise a cDNA molecule. An isolated nucleic acid molecule inserted into a vector is also sometimes referred to herein as a recombinant nucleic acid molecule.

With respect to RNA molecules, the term “isolated nucleic acid” primarily refers to an RNA molecule encoded by an isolated DNA molecule as defined above. Alternatively, the term may refer to an RNA molecule that has been sufficiently separated from RNA molecules with which it would be associated in its natural state (i.e., in cells or tissues), such that it exists in a “substantially pure” form.

With respect to single stranded nucleic acids, particularly oligonucleotides, the term “specifically hybridizing” refers to the association between two single-stranded nucleotide molecules of sufficiently complementary sequence to permit such hybridization under pre-determined conditions generally used in the art (sometimes termed “substantially complementary”). In particular, the term refers to hybridization of an oligonucleotide with a substantially complementary sequence contained within a single-stranded DNA or RNA molecule of the invention, to the substantial exclusion of hybridization of the oligonucleotide with single-stranded nucleic acids of non-complementary sequence. Appropriate conditions enabling specific hybridization of single stranded nucleic acid molecules of varying complementarity are well known in the art.

For instance, one common formula for calculating the stringency conditions required to achieve hybridization between nucleic acid molecules of a specified sequence homology is set forth below (Sambrook et al., 1989):

$$T_m = 81.5\,^{\circ}\mathrm{C} + 16.6\,\log_{10}[\mathrm{Na}^+] + 0.41\,(\%\,\mathrm{G{+}C}) - 0.63\,(\%\,\text{formamide}) - \frac{600}{\#\,\text{bp in duplex}}$$

As an illustration of the above formula, using [Na+]=[0.368] and 50% formamide, with GC content of 42% and an average probe size of 200 bases, the T_(m) is 57°C. The T_(m) of a DNA duplex decreases by 1-1.5°C with every 1% decrease in homology. Thus, targets with greater than about 75% sequence identity would be observed using a hybridization temperature of 42°C.
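The illustration above can be checked numerically. The following minimal sketch evaluates the stated formula with the stated inputs. The function and parameter names are illustrative, and the logarithm is taken as base 10, which is the reading that reproduces the stated result of 57°C.

```python
import math


def melting_temperature(na_molar, gc_percent, formamide_percent, duplex_bp):
    """Evaluate the T_m formula given in the text; result is in degrees Celsius."""
    return (
        81.5
        + 16.6 * math.log10(na_molar)   # sodium ion concentration term
        + 0.41 * gc_percent             # G+C content term
        - 0.63 * formamide_percent      # formamide term
        - 600.0 / duplex_bp             # duplex length term
    )


# Values from the worked illustration in the text.
tm = melting_temperature(na_molar=0.368, gc_percent=42, formamide_percent=50, duplex_bp=200)
print(f"T_m = {tm:.1f} C")
```

With these inputs the individual terms are approximately 81.5, −7.2, +17.2, −31.5 and −3.0, which sum to about 57.0°C.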

The term “oligonucleotide,” as used herein refers to primers and probes of the present invention, and is defined as a nucleic acid molecule comprised of two or more ribo- or deoxyribonucleotides, preferably more than three. The exact size of the oligonucleotide will depend on various factors and on the particular application and use of the oligonucleotide. Preferred oligonucleotides comprise 15-50 consecutive bases of SEQ ID NOs: 1-28.
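As a simple illustration of what “15-50 consecutive bases” of a sequence means, every window of a chosen length can be enumerated from a nucleotide sequence. This is a sketch only. The 20-base input is a hypothetical sequence, not one of SEQ ID NOs: 1-28, and no selection for specificity or melting temperature is attempted.

```python
def candidate_oligos(sequence, length):
    """Return every run of `length` consecutive bases found in `sequence`."""
    if not 15 <= length <= 50:
        raise ValueError("length must fall within the 15-50 base range")
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


# Hypothetical 20-base sequence used only for illustration.
example = "GGTGCTGGACAAGGTGGATA"
oligos = candidate_oligos(example, 15)
print(len(oligos), oligos[0])
```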

The term “probe” as used herein refers to an oligonucleotide, polynucleotide or nucleic acid, either RNA or DNA, whether occurring naturally as in a purified restriction enzyme digest or produced synthetically, which is capable of annealing with or specifically hybridizing to a nucleic acid with sequences complementary to the probe. A probe may be either single-stranded or double-stranded. The exact length of the probe will depend upon many factors, including temperature, source of probe and use of the method. For example, depending on the complexity of the target sequence, the oligonucleotide probe typically contains 15-25 or more nucleotides, although it may contain fewer nucleotides. The probes herein are selected to be complementary to different strands of a particular target nucleic acid sequence. This means that the probes must be sufficiently complementary so as to be able to “specifically hybridize” or anneal with their respective target strands under a set of pre-determined conditions. Therefore, the probe sequence need not reflect the exact complementary sequence of the target. For example, a non-complementary nucleotide fragment may be attached to the 5′ or 3′ end of the probe, with the remainder of the probe sequence being complementary to the target strand. Alternatively, non-complementary bases or longer sequences can be interspersed into the probe, provided that the probe sequence has sufficient complementarity with the sequence of the target nucleic acid to anneal therewith specifically.

The term “primer” as used herein refers to an oligonucleotide, either RNA or DNA, either single-stranded or double-stranded, either derived from a biological system, generated by restriction enzyme digestion, or produced synthetically which, when placed in the proper environment, is able to functionally act as an initiator of template-dependent nucleic acid synthesis. When presented with an appropriate nucleic acid template, suitable nucleoside triphosphate precursors of nucleic acids, a polymerase enzyme, suitable cofactors and conditions such as a suitable temperature and pH, the primer may be extended at its 3′ terminus by the addition of nucleotides by the action of a polymerase or similar activity to yield a primer extension product. The primer may vary in length depending on the particular conditions and requirement of the application. For example, in diagnostic applications, the oligonucleotide primer is typically 15-25 or more nucleotides in length. The primer must be of sufficient complementarity to the desired template to prime the synthesis of the desired extension product, that is, to be able to anneal with the desired template strand in a manner sufficient to provide the 3′ hydroxyl moiety of the primer in appropriate juxtaposition for use in the initiation of synthesis by a polymerase or similar enzyme. It is not required that the primer sequence represent an exact complement of the desired template. For example, a non-complementary nucleotide sequence may be attached to the 5′ end of an otherwise complementary primer. Alternatively, non-complementary bases may be interspersed within the oligonucleotide primer sequence, provided that the primer sequence has sufficient complementarity with the sequence of the desired template strand to functionally provide a template-primer complex for the synthesis of the extension product.

Polymerase chain reaction (PCR) has been described in U.S. Pat. Nos. 4,683,195, 4,800,195, and 4,965,188, the entire disclosures of which are incorporated by reference herein.

Amino acid residues described herein are preferred to be in the “L” isomeric form. However, residues in the “D” isomeric form may be substituted for any L-amino acid residue, provided the desired properties of the polypeptide are retained. All amino-acid residue sequences represented herein conform to the conventional left-to-right amino-terminus to carboxy-terminus orientation.

Amino acid residues are identified in the present application according to the three-letter or one-letter abbreviations in the following Table:

TABLE 1

Amino Acid         3-letter Abbreviation   1-letter Abbreviation
L-Alanine          Ala                     A
L-Arginine         Arg                     R
L-Asparagine       Asn                     N
L-Aspartic Acid    Asp                     D
L-Cysteine         Cys                     C
L-Glutamine        Gln                     Q
L-Glutamic Acid    Glu                     E
Glycine            Gly                     G
L-Histidine        His                     H
L-Isoleucine       Ile                     I
L-Leucine          Leu                     L
L-Methionine       Met                     M
L-Phenylalanine    Phe                     F
L-Proline          Pro                     P
L-Serine           Ser                     S
L-Threonine        Thr                     T
L-Tryptophan       Trp                     W
L-Tyrosine         Tyr                     Y
L-Valine           Val                     V
L-Lysine           Lys                     K
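The abbreviations of Table 1 can be expressed as a lookup for converting between the two notations. The sketch below assumes, purely for illustration, that a three-letter sequence is written with hyphens between residues in amino-terminus to carboxy-terminus order. The function name and example peptide are hypothetical.

```python
# Three-letter to one-letter abbreviations, as listed in Table 1.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Met": "M", "Phe": "F", "Pro": "P", "Ser": "S",
    "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V", "Lys": "K",
}


def to_one_letter(three_letter_sequence):
    """Convert a hyphen-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[residue] for residue in three_letter_sequence.split("-"))


# Hypothetical example peptide.
print(to_one_letter("Gly-Pro-Gly-Gln-Gln"))
```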

The term “isolated protein” or “isolated and purified protein” is sometimes used herein. This term refers primarily to a protein produced by expression of an isolated nucleic acid molecule of the invention. Alternatively, this term may refer to a protein that has been sufficiently separated from other proteins with which it would naturally be associated, so as to exist in “substantially pure” form. “Isolated” is not meant to exclude artificial or synthetic mixtures with other compounds or materials, or the presence of impurities that do not interfere with the fundamental activity, and that may be present, for example, due to incomplete purification, addition of stabilizers, or compounding into, for example, immunogenic preparations or pharmaceutically acceptable preparations.

“Mature protein” or “mature polypeptide” shall mean a polypeptide possessing the sequence of the polypeptide after any processing events that normally occur to the polypeptide during the course of its genesis, such as proteolytic processing from a polyprotein precursor. In designating the sequence or boundaries of a mature protein, the first amino acid of the mature protein sequence is designated as amino acid residue 1. As used herein, any amino acid residues associated with a mature protein, not naturally found associated with that protein, that precede amino acid 1 are designated amino acid −1, −2, −3 and so on. For recombinant expression systems, a methionine initiator codon is often utilized for purposes of efficient translation. This methionine residue in the resulting polypeptide, as used herein, would be positioned at −1 relative to the mature protein sequence.
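The numbering convention just described, in which the mature sequence starts at 1 and preceding residues count backward from −1 with no residue numbered 0, can be sketched as follows. The function name and the example sequences are hypothetical.

```python
def number_residues(preceding, mature):
    """Pair each residue with its position under the mature-protein convention.

    Residues of `preceding` receive -n ... -1; residues of `mature` receive 1 ... m.
    """
    numbered = [(i - len(preceding), residue) for i, residue in enumerate(preceding)]
    numbered += [(i, residue) for i, residue in enumerate(mature, start=1)]
    return numbered


# An initiator methionine placed ahead of a hypothetical mature sequence.
print(number_residues("M", "GPGQQ"))
```

Here the initiator methionine is reported at position −1 and the first glycine of the mature sequence at position 1.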

A low molecular weight “peptide analog” shall mean a natural or mutant (mutated) analog of a protein, comprising a linear or discontinuous series of fragments of that protein and which may have one or more amino acids replaced with other amino acids and which has altered, enhanced or diminished biological activity when compared with the parent or nonmutated protein.

The term “biological activity” is a function or set of functions performed by a molecule in a biological context (i.e., in an organism or an in vitro surrogate or facsimile model). For spider silk proteins, biological activity is characterized by physical properties (e.g., tensile strength and elasticity) as described herein.

The term “substantially pure” refers to a preparation comprising at least 50-60% by weight the compound of interest (e.g., nucleic acid, oligonucleotide, polypeptide, protein, etc.). More preferably, the preparation comprises at least 75% by weight, and most preferably 90-99% by weight, the compound of interest. Purity is measured by methods appropriate for the compound of interest (e.g., chromatographic methods, agarose or polyacrylamide gel electrophoresis, HPLC analysis, mass spectrometry and the like).

The term “tag,” “tag sequence” or “protein tag” refers to a chemical moiety, either a nucleotide, oligonucleotide, polynucleotide or an amino acid, peptide or protein or other chemical, that when added to another sequence, provides additional utility or confers useful properties, particularly in the detection or isolation, of that sequence. Thus, for example, a homopolymer nucleic acid sequence or a nucleic acid sequence complementary to a capture oligonucleotide may be added to a primer or probe sequence to facilitate the subsequent isolation of an extension product or hybridized product. In the case of protein tags, histidine residues (e.g., 4 to 8 consecutive histidine residues) may be added to either the amino- or carboxy-terminus of a protein to facilitate protein isolation by chelating metal chromatography. Alternatively, amino acid sequences, peptides, proteins or fusion partners representing epitopes or binding determinants reactive with specific antibody molecules or other molecules (e.g., flag epitope, c-myc epitope, transmembrane epitope of the influenza A virus hemagglutinin protein, protein A, cellulose binding domain, calmodulin binding protein, maltose binding protein, chitin binding domain, glutathione S-transferase, and the like) may be added to proteins to facilitate protein isolation by procedures such as affinity or immunoaffinity chromatography. Chemical tag moieties include such molecules as biotin, which may be added to either nucleic acids or proteins, and facilitates isolation or detection by interaction with avidin reagents, and the like. Numerous other tag moieties are known to, and can be envisioned by, the trained artisan, and are contemplated to be within the scope of this definition.

A “vector” is a replicon, such as a plasmid, cosmid, bacmid, phage or virus, to which another genetic sequence or element (either DNA or RNA) may be attached so as to bring about the replication of the attached sequence or element. An “expression vector” is a specialized vector that contains a gene with the necessary regulatory regions needed for expression in a host cell.

The term “operably linked” means that the regulatory sequences necessary for expression of the coding sequence are placed in the DNA molecule in the appropriate positions relative to the coding sequence so as to effect expression of the coding sequence. This same definition is sometimes applied to the arrangement of coding sequences and transcription control elements (e.g., promoters, enhancers, and termination elements) in an expression vector. This definition is also sometimes applied to the arrangement of nucleic acid sequences of a first and a second nucleic acid molecule wherein a hybrid nucleic acid molecule is generated.

The phrase “consisting essentially of” when referring to a particular nucleotide or amino acid sequence means a sequence having the properties of a given SEQ ID NO. For example, when used in reference to an amino acid sequence, the phrase includes the sequence per se and molecular modifications that would not affect the basic and novel characteristics of the sequence.

A “clone” or “clonal cell population” is a population of cells derived from a single cell or common ancestor by mitosis.

A “cell line” is a clone of a primary cell or cell population that is capable of stable growth in vitro for many generations.

An “immune response” signifies any reaction produced by an antigen, such as a viral antigen, in a host having a functioning immune system. Immune responses may be either humoral in nature, that is, involve production of immunoglobulins or antibodies, or cellular in nature, involving various types of B and T lymphocytes, dendritic cells, macrophages, antigen presenting cells and the like, or both. Immune responses may also involve the production or elaboration of various effector molecules such as cytokines, lymphokines and the like. Immune responses may be measured both in vitro and in various cellular or animal systems. Such immune responses may be important in protecting the host from disease and may be used prophylactically and therapeutically.

An “antibody” or “antibody molecule” is any immunoglobulin, including antibodies and fragments thereof, that binds to a specific antigen. The term includes polyclonal, monoclonal, chimeric, and bispecific antibodies. As used herein, antibody or antibody molecule contemplates both an intact immunoglobulin molecule and an immunologically active portion of an immunoglobulin molecule such as those portions known in the art as Fab, Fab′, F(ab′)2 and F(v).

With respect to antibodies, the term “immunologically specific” refers to antibodies that bind to one or more epitopes of a protein or compound of interest, but which do not substantially recognize and bind other molecules in a sample containing a mixed population of antigenic biological molecules.

“Natural allelic variants”, “mutants” and “derivatives” of particular sequences of nucleic acids refer to nucleic acid sequences that are closely related to a particular sequence but which may possess, either naturally or by design, changes in sequence or structure. By closely related, it is meant that at least about 75%, but often, more than 90%, of the nucleotides of the sequence match over the defined length of the nucleic acid sequence referred to using a specific SEQ ID NO. Changes or differences in nucleotide sequence between closely related nucleic acid sequences may represent nucleotide changes in the sequence that arise during the course of normal replication or duplication in nature of the particular nucleic acid sequence. Other changes may be specifically designed and introduced into the sequence for specific purposes, such as to change an amino acid codon or sequence in a regulatory region of the nucleic acid. Such specific changes may be made in vitro using a variety of mutagenesis techniques or produced in a host organism placed under particular selection conditions that induce or select for the changes. Such sequence variants generated specifically may be referred to as “mutants” or “derivatives” of the original sequence.
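By way of illustration only, the "closely related" criterion above (at least about 75% of nucleotides matching over the defined length) can be sketched as a position-by-position comparison. The sketch assumes the two sequences have already been aligned to equal length without gaps; the function names and the example sequences are hypothetical and are not taken from any SEQ ID NO.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of matching positions between two pre-aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


def is_closely_related(seq_a: str, seq_b: str, threshold: float = 75.0) -> bool:
    """Apply the 'at least about 75%' match criterion (threshold is adjustable)."""
    return percent_identity(seq_a, seq_b) >= threshold


# Hypothetical 20-nucleotide segments differing at two positions.
reference = "GGTGCTGGTGCAGGTGCTGG"
variant = "GGTGCAGGTGCAGGAGCTGG"
print(percent_identity(reference, variant), is_closely_related(reference, variant))
```

A gapped alignment would be needed before applying such a comparison to sequences of unequal length.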

A “derivative” of a spider silk protein or a fragment thereof means a polypeptide modified by varying the amino acid sequence of the protein, e.g. by manipulation of the nucleic acid encoding the protein or by altering the protein itself. Such derivatives of the natural amino acid sequence may involve insertion, addition, deletion or substitution of one or more amino acids, and may or may not alter the essential activity of the original spider silk protein.

As mentioned above, the spider silk polypeptide or protein of the invention includes any analogue, fragment, derivative or mutant which is derived from a spider silk protein and which retains at least one property or other characteristic of a spider silk protein. Different “variants” of spider silk proteins exist in nature. These variants may be alleles characterized by differences in the nucleotide sequences of the gene coding for the protein, or may involve different RNA processing or post-translational modifications. The skilled person can produce variants having single or multiple amino acid substitutions, deletions, additions or replacements. These variants may include inter alia: (a) variants in which one or more amino acid residues are substituted with conservative or non-conservative amino acids, (b) variants in which one or more amino acids are added to a spider silk protein, (c) variants in which one or more amino acids include a substituent group, and (d) variants in which a spider silk protein or fragment thereof is fused with another peptide or polypeptide such as a fusion partner, a protein tag or other chemical moiety, that may confer useful properties to a spider silk protein, such as, for example, an epitope for an antibody, a polyhistidine sequence, a biotin moiety and the like. Other spider silk proteins of the invention include variants in which amino acid residues from one species are substituted for the corresponding residue in another species, either at the conserved or non-conserved positions. In another embodiment, amino acid residues at non-conserved positions are substituted with conservative or non-conservative residues. The techniques for obtaining these variants, including genetic (suppressions, deletions, mutations, etc.), chemical, and enzymatic techniques are known to the person having ordinary skill in the art.

To the extent such allelic variations, analogues, fragments, derivatives, mutants, and modifications, including alternative nucleic acid processing forms and alternative post-translational modification forms, result in derivatives of spider silk protein that retain any of the biological properties of a spider silk protein, they are included within the scope of this invention.

The term “functional” as used herein implies that the nucleic or amino acid sequence is functional for the recited assay or purpose.

A “unit repeat” constitutes a repetitive short sequence. Thus, the primary structure of the spider silk proteins is considered to consist mostly of a series of small variations of a unit repeat. The unit repeats in the naturally occurring proteins are often distinct from each other. That is, there is little or no exact duplication of the unit repeats along the length of the protein. Synthetic spider silks, however, can be made wherein the primary structure of the protein comprises a number of exact repetitions of a single unit repeat. Additional synthetic spider silks can be synthesized which comprise a number of repetitions of one unit repeat together with a number of repetitions of a second unit repeat. Such a structure would be similar to a typical block copolymer. Unit repeats of several different sequences can also be combined to provide a synthetic spider silk protein having properties suited to a particular application.

The term “direct repeat” as used herein is a repeat in tandem (head-to-tail arrangement) with a similar repeat.
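As a purely illustrative sketch, the block copolymer arrangement described above (a number of exact repetitions of one unit repeat followed by a number of repetitions of a second unit repeat) can be expressed as simple string assembly. The two unit repeats shown are hypothetical sequences loosely patterned on the GPG(X)n and poly-alanine motifs mentioned in the background; they are assumptions and are not sequences disclosed herein.

```python
from typing import Iterable, Tuple


def build_block_copolymer(blocks: Iterable[Tuple[str, int]]) -> str:
    """Concatenate blocks, each block being `copies` exact repetitions of a unit repeat."""
    parts = []
    for unit, copies in blocks:
        if not unit or copies < 1:
            raise ValueError("each block needs a non-empty unit repeat and at least one copy")
        parts.append(unit * copies)
    return "".join(parts)


# Hypothetical unit repeats (one-letter amino acid code), for illustration only.
UNIT_ELASTIC = "GPGGYGPGQQ"   # patterned on a GPG(X)n-type motif
UNIT_CRYSTAL = "GGAGAAAAAA"   # patterned on a GGX / poly-alanine motif

protein = build_block_copolymer([(UNIT_ELASTIC, 4), (UNIT_CRYSTAL, 2)])
print(len(protein), protein)
```

Changing the order of the blocks or the number of copies per block gives a simple way to enumerate candidate designs before any DNA is synthesized.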

II. Preparation of Spider Silk-Encoding Nucleic Acid Molecules, Spider Silk Proteins, and Antibodies Thereto

A. Nucleic Acid Molecules

Nucleic acid molecules encoding the polypeptides of the invention may be prepared by two general methods: (1) synthesis from appropriate nucleotide triphosphates, or (2) isolation from biological sources. Both methods utilize protocols well known in the art. The availability of nucleotide sequence information, such as the DNA sequences encoding a spider silk protein, enables preparation of an isolated nucleic acid molecule of the invention by oligonucleotide synthesis. Synthetic oligonucleotides may be prepared by the phosphoramidite method employed in the Applied Biosystems 38A DNA Synthesizer or similar devices. The resultant construct may be used directly or purified according to methods known in the art, such as high performance liquid chromatography (HPLC).

Specific probes for identifying such sequences as a spider silk protein encoding sequence may be between 15 and 40 nucleotides in length. For probes longer than those described above, the additional contiguous nucleotides are provided within sequences encoding a spider silk protein.

In accordance with the present invention, nucleic acids having the appropriate level of sequence homology with sequences encoding a spider silk protein may be identified by using hybridization and washing conditions of appropriate stringency. For example, hybridizations may be performed, according to the method of Sambrook et al., Molecular Cloning, Cold Spring Harbor Laboratory (1989), using a hybridization solution comprising: 5×SSC, 5×Denhardt's reagent, 1.0% SDS, 100 μg/ml denatured, fragmented salmon sperm DNA, 0.05% sodium pyrophosphate and up to 50% formamide. Hybridization is carried out at 37-42° C. for at least six hours. Following hybridization, filters are washed as follows: (1) 5 minutes at room temperature in 2×SSC and 1% SDS; (2) 15 minutes at room temperature in 2×SSC and 0.1% SDS; (3) 30 minutes to 1 hour at 37° C. in 1×SSC and 1% SDS; (4) 2 hours at 42-65° C. in 1×SSC and 1% SDS, changing the solution every 30 minutes.
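For convenience, the volumes of stock reagents needed to reach the final concentrations recited above can be worked out by simple dilution arithmetic (stock concentration × stock volume = final concentration × final volume). The following is a minimal sketch only: the final concentrations are those given in the text (with formamide at its 50% upper limit and salmon sperm DNA expressed as 0.1 mg/ml), whereas every stock concentration is an assumption chosen for illustration.

```python
# (stock concentration, final concentration), both in the unit named in the key.
# Final values follow the text; all stock values are assumed for illustration.
RECIPE = {
    "SSC (x)": (20.0, 5.0),
    "Denhardt's reagent (x)": (50.0, 5.0),
    "SDS (%)": (10.0, 1.0),
    "salmon sperm DNA (mg/ml)": (10.0, 0.1),
    "sodium pyrophosphate (%)": (5.0, 0.05),
    "formamide (%)": (100.0, 50.0),
}


def stock_volumes(final_volume_ml: float, recipe: dict = RECIPE) -> dict:
    """Return the volume (ml) of each stock, plus water, for the requested final volume."""
    volumes = {
        name: final_volume_ml * final / stock
        for name, (stock, final) in recipe.items()
    }
    water = final_volume_ml - sum(volumes.values())
    if water < -1e-9:
        raise ValueError("stocks are too dilute to reach the requested concentrations")
    volumes["water"] = max(water, 0.0)
    return volumes


for component, ml in stock_volumes(10.0).items():
    print(f"{component}: {ml:.2f} ml")
```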

The nucleic acid molecules described herein include cDNA, genomic DNA, RNA, and fragments thereof which may be single- or double-stranded. Thus, oligonucleotides are provided having sequences capable of hybridizing with at least one sequence of a nucleic acid sequence, such as selected segments of sequences encoding a spider silk protein. Also contemplated in the scope of the present invention are methods of use for oligonucleotide probes which specifically hybridize with DNA from sequences encoding a spider silk protein under high stringency conditions. Primers capable of specifically amplifying sequences encoding a spider silk protein are also provided. As mentioned previously, such oligonucleotides are useful as primers for detecting, isolating and amplifying sequences encoding a spider silk protein.

Antisense nucleic acid molecules which may be targeted to translation initiation sites and/or splice sites to inhibit the expression of spider silk protein genes or production of their encoded proteins are also provided. Such antisense molecules are typically between 15 and 30 nucleotides in length and often span the translational start site of a spider silk protein mRNA molecule.
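As an illustrative sketch of the design rule just stated (an antisense molecule of 15 to 30 nucleotides spanning the translational start site), the reverse complement of a window around the ATG start codon can be computed as follows. The mRNA is written in its DNA (cDNA sense strand) form; the example sequence, the function names and the default window placement are all hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_over_start(sense: str, start_index: int,
                         upstream: int = 10, length: int = 20) -> str:
    """Antisense DNA oligo complementary to a window that spans the ATG at start_index."""
    if not 15 <= length <= 30:
        raise ValueError("length should be between 15 and 30 nucleotides")
    if sense[start_index:start_index + 3].upper() != "ATG":
        raise ValueError("start_index does not point at an ATG start codon")
    begin = start_index - upstream
    end = begin + length
    if upstream < 0 or begin < 0 or end > len(sense):
        raise ValueError("window falls outside the sequence")
    if start_index + 3 > end:
        raise ValueError("window does not span the start codon")
    return reverse_complement(sense[begin:end])


# Hypothetical sense-strand sequence with an ATG at index 10.
sense_strand = "CCGGAGCACCATGGCAGGTGCTGGTGCAGG"
oligo = antisense_over_start(sense_strand, start_index=10)
print(oligo)
```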

B. Proteins

Full-length spider silk proteins of the present invention may be prepared in a variety of ways, according to known methods. The proteins may be purified from appropriate sources, e.g., transformed bacterial or animal cultured cells or tissues, by immunoaffinity purification. However, this is not a preferred method due to the low levels of protein likely to be present in a given cell type at any time. The availability of nucleic acid molecules encoding spider silk proteins enables production of the proteins using in vitro expression methods known in the art. For example, a cDNA or gene may be cloned into an appropriate in vitro transcription vector, such as pSP64 or pSP65 for in vitro transcription, followed by cell-free translation in a suitable cell-free translation system, such as wheat germ or rabbit reticulocytes. In vitro transcription and translation systems are commercially available, e.g., from Promega Biotech, Madison, Wis. or Gibco-BRL, Gaithersburg, Md.

Alternatively, according to a preferred embodiment, larger quantities of spider silk protein may be produced by expression in a suitable prokaryotic or eukaryotic system. For example, part or all of at least one DNA molecule, such as nucleic acid sequences having a sequence selected from the group of SEQ ID NOs: 1-28, may be inserted into a plasmid vector adapted for expression in a bacterial cell, such as E. coli. Such vectors comprise the regulatory elements necessary for expression of the DNA in the host cell positioned in such a manner as to permit expression of the DNA in the host cell. Such regulatory elements required for expression include promoter sequences, transcription initiation sequences and, optionally, enhancer sequences.

The spider silk proteins produced by gene expression in a recombinant prokaryotic or eukaryotic system may be purified according to methods known in the art. In a preferred embodiment, a commercially available expression/secretion system can be used, whereby the recombinant protein is expressed and thereafter secreted from the host cell, to be easily purified from the surrounding medium. If expression/secretion vectors are not used, an alternative approach involves purifying the recombinant protein from cell lysates (remains of cells following disruption of cellular integrity) derived from prokaryotic or eukaryotic cells in which a protein was expressed. Methods for generation of such cell lysates are known to those of skill in the art. Recombinant protein can be purified by affinity separation, such as by immunological interaction with antibodies that bind specifically to the recombinant protein or by nickel columns for isolation of recombinant proteins tagged with 6-8 histidine residues at their N-terminus or C-terminus. Alternative tags may comprise the FLAG epitope or the hemagglutinin epitope. Such methods are commonly used by skilled practitioners.

The spider silk proteins of the invention, prepared by the aforementioned methods, may be analyzed according to standard procedures. For example, such proteins may be subjected to amino acid sequence analysis, according to known methods.
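When designing or checking a tagged construct, it can be useful to confirm in silico that a run of 6-8 histidine residues is present at one terminus of the encoded protein. The following sketch makes two assumptions that are not stated in the text: an initiator methionine, if present, is allowed to precede an N-terminal tag, and the example sequences are hypothetical.

```python
from typing import Optional


def his_tag_location(protein: str, min_len: int = 6, max_len: int = 8) -> Optional[str]:
    """Return 'N-terminus', 'C-terminus' or None, based on terminal runs of histidine."""
    seq = protein.upper()
    # Assumption: skip a single leading Met so that "MHHHHHH..." counts as N-terminal.
    n_body = seq[1:] if seq.startswith("M") else seq
    n_run = len(n_body) - len(n_body.lstrip("H"))
    c_run = len(seq) - len(seq.rstrip("H"))
    if min_len <= n_run <= max_len:
        return "N-terminus"
    if min_len <= c_run <= max_len:
        return "C-terminus"
    return None


print(his_tag_location("MHHHHHHGPGGYGPGQQ"))    # hypothetical N-terminally tagged fragment
print(his_tag_location("GGAGAAAAAAHHHHHHHH"))   # hypothetical C-terminally tagged fragment
```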

A protein produced according to the present invention can be chemically modified after synthesis of the polypeptide. The presence of several carboxylic acid side chains (Asp or Glu) in the spacer regions facilitates the attachment of a variety of different chemical groups to silk proteins including amino acids having such side chains. The simplest and easiest procedure is to use a water-soluble carbodiimide to attach the modifying group via a primary amine. If the group to be attached has no primary amine, a variety of linking agents can be attached via their own primary amines and then the modifying group attached via an available chemistry (see Jennes, L. and Stumpf, W. E., Neuroendocrine Peptide Methodology, Chapter 42, P. Michael Conn, editor, Academic Press, 1989).

Desirable chemical modifications include, but are not limited to, derivatization with peptides that bind to cells, e.g. fibroblasts, derivatization with antibiotics and derivatization with cross-linking agents so that cross-linked fibers can be made. The selection of derivatizing agents for a particular purpose is within the skill of the ordinary practitioner of the art.

Exemplary Methods for Generation of Spider Silk Proteins

In view of the unique properties of spider silk proteins, special considerations should be applied to the generation of synthetic spider silk proteins. The repetitive nature of the amino acid sequences of these proteins may render synthesis of a full length spider silk protein, or fragments thereof, technically challenging. To facilitate production of full length silk protein molecules, the following protocol is provided.

The polypeptides of the present invention can be made by direct synthesis or by expression from cloned DNA. Means for expressing cloned DNA are set forth above and are generally known in the art. The following considerations are recommended for the design of expression vectors used to express DNA encoding the spider silk proteins of the present invention.

First, since spider silk proteins are highly repetitive in their structure, cloned DNA should be propagated and expressed in host cell strains that can maintain repetitive sequences in extrachromosomal elements (e.g. SURE™ cells, Stratagene). The prevalence of specific amino acids (e.g., alanine, glycine, proline, and glutamine) also suggests that it might be advantageous to use a host cell that overexpresses tRNA for these amino acids.
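The prevalence of particular amino acids in a candidate sequence can be estimated before a host strain is chosen. The sketch below tallies residue counts and reports the combined fraction of alanine, glycine, proline and glutamine; the example fragment is hypothetical, and no threshold for selecting a tRNA-overexpressing host is implied by the text.

```python
from collections import Counter


def residue_counts(protein: str) -> Counter:
    """Count each amino acid (one-letter code) in the sequence."""
    return Counter(protein.upper())


def combined_fraction(protein: str, residues: str = "AGPQ") -> float:
    """Fraction of the sequence made up of the listed residues."""
    if not protein:
        raise ValueError("sequence must not be empty")
    counts = residue_counts(protein)
    return sum(counts[r] for r in residues) / len(protein)


fragment = "GPGGYGPGQQGPGGYGPGQQGGAGAAAAAA"  # hypothetical repeat-rich fragment
print(residue_counts(fragment).most_common(4))
print(f"A+G+P+Q fraction: {combined_fraction(fragment):.2f}")
```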

The proteins of the present invention can otherwise be expressed using vectors providing for high level transcription, fusion proteins allowing affinity purification through an epitope tag, and the like. The hosts can be either bacterial or eukaryotic cells. Eukaryotic cells such as yeast, especially Saccharomyces cerevisiae, or insect cells might be particularly useful eukaryotic hosts. Expression of an engineered minor ampullate silk protein is described in U.S. Pat. No. 5,756,677, herein incorporated by reference. Such an approach can be used to express proteins of the present invention.

A useful spider silk protein or fragment thereof may be (1) insoluble inside a cell in which it is expressed or (2) capable of being formed into an insoluble fiber under normal conditions by which fibers are made. Preferably, the protein is insoluble under conditions (1) and (2). Specifically, the protein or fragment may be insoluble in a solvent such as water, alcohol (methanol, ethanol, etc.), acetone and/or organic acids, etc. The spider silk protein or fragment thereof should be capable of being formed into a fiber having high tensile strength, e.g., a tensile strength of 0.5x to 2x wherein x is the tensile strength of a fiber formed from a corresponding natural silk or whole protein. A spider silk protein or fragment thereof should also be capable of being formed into a fiber possessing high elasticity, e.g., at least 15%, more preferably about 25%.

Variants of a spider silk protein may be formed into a fiber having a tensile strength and/or elasticity which is greater than that of the natural spider silk or natural protein. The elasticity may be increased up to 100%. Variants may also possess properties of protein fragments.

A fragment or variant may have substantially the same characteristics as a natural spider silk. The natural protein may be particularly insoluble when in fiber form and resistant to degradation by most enzymes.

Recombinant spider silk proteins may be recovered from cultures by lysing cells to release spider silk proteins expressed therein. Initially, cell debris can be separated by centrifugation. Clarified cell lysate comprised of debris and supernatant can then be repeatedly extracted with solvents in which spider silk proteins are insoluble, but cellular debris is soluble. A differential solubilization process such as described above can be used to facilitate isolation of a purified spider silk protein precipitate. These procedures can be repeated and combined with other procedures including filtration, dialysis and/or chromatography to obtain a pure product.

Fibrillar aggregates will form from solutions by spontaneous self-assembly of spider silk proteins when the protein concentration exceeds a critical value. The aggregates can be gathered and mechanically spun into macroscopic fibers according to the method of O'Brien et al. [I. O'Brien et al., “Design, Synthesis and Fabrication of Novel Self-Assembling Fibrillar Proteins”, in Silk Polymers: Materials Science and Biotechnology, pp. 104-117, Kaplan, Adams, Farmer and Viney, eds., c. 1994 by American Chemical Society, Washington, D.C.].

Exemplary Methods for Preparation of Fibers From Spider Silk Proteins

As noted above, the spider silk proteins can be viewed as derivatized polyamides. Accordingly, methods for producing fiber from soluble spider silk proteins are similar to those used to produce typical polyamide fibers, e.g. nylons, and the like.

O'Brien et al. supra describe fiber production from adenovirus fiber proteins. In a typical fiber production, spider silk proteins can be solubilized in a strongly polar solvent. The protein concentration of such a protein solution should typically be greater than 5% and is preferably between 8 and 20%.

Fibers should preferably be spun from solutions having properties characteristic of a liquid crystal phase. The fiber concentration at which phase transition can occur is dependent on the polypeptide composition of a protein or combination of proteins present in the solution. Phase transition, however, can be detected by monitoring the clarity and birefringence of the solution. Onset of a liquid crystal phase can be detected when the solution acquires a translucent appearance and registers birefringence when viewed through crossed polarizing filters.

The solvent used to dissolve a spider silk protein should be polar, and is preferably highly polar. Such solvents are exemplified by di- and tri-haloacetic acids, and haloalcohols (e.g. hexafluoroisopropanol). In some instances, co-solvents such as acetone are useful. Solutions of chaotropic agents, such as lithium thiocyanate, guanidine thiocyanate or urea can also be used.

In one fiber-forming technique, fibers can first be extruded from the protein solution through an orifice into methanol, until a length sufficient to be picked up by a mechanical means is produced. Then a fiber can be pulled by such mechanical means through a methanol solution, collected, and dried. Methods for drawing fibers are considered well-known in the art. For example, fibers made from a 58 kDa synthetic MaSp consensus polypeptide were drawn by methods similar to those used for drawing low molecular weight nylons. Such methods are described in U.S. Pat. No. 5,994,099, the entirety of which is incorporated herein by reference.

Of note, spider silk proteins of the present invention have primary structures dominated by imperfect repetition of a short sequence of amino acids. A “unit repeat” constitutes one such short sequence. Thus, the primary structure of a spider silk protein can be thought to consist mostly of a series of small variations of a unit repeat. Unit repeats in a naturally occurring protein are often distinct from each other. In other words, there is little or no exact duplication of a unit repeat along the length of a protein. Synthetic spider silks, however, can be generated wherein the primary structure of a synthetic spider silk protein can be described as a number of exact repetitions of a single unit repeat. Additional synthetic spider silks can be described as a number of repetitions of one unit repeat together with a number of repetitions of a second unit repeat. Such a structure would be similar to a typical block copolymer. The present invention also encompasses generation of synthetic spider silk proteins comprising unit repeats derived from several different spider silk sequences (naturally occurring variants or genetically engineered variants thereof).

Such synthetic hybrid spider silk proteins may each have 900 to 2700 amino acids with 25 to 100, preferably 30 to 90 repeats. A spider silk or fragment or variant thereof usually has a molecular weight of at least about 16,000 daltons, preferably 16,000 to 150,000 daltons, more preferably 50,000 to 120,000 daltons for fragments and greater than 100,000 but less than 500,000 daltons, preferably 120,000 to 350,000 daltons for a full length protein.
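To relate a designed sequence to the molecular weight ranges recited above, the mass of a polypeptide can be estimated from its amino acid sequence by summing average residue masses and adding one water molecule for the free termini. This is a minimal sketch: the residue masses are standard average values, post-translational modifications are ignored, and the example sequence is hypothetical.

```python
# Average residue masses in daltons (amino acid minus one water molecule).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528


def molecular_mass(protein: str) -> float:
    """Estimated average molecular mass (daltons) of an unmodified polypeptide."""
    if not protein:
        raise ValueError("sequence must not be empty")
    try:
        return sum(RESIDUE_MASS[aa] for aa in protein.upper()) + WATER_MASS
    except KeyError as exc:
        raise ValueError(f"unknown residue: {exc.args[0]}") from None


# Hypothetical 30-residue unit repeat used 30 times, giving 900 amino acids.
unit = "GPGGYGPGQQGPGGYGPGQQGGAGAAAAAA"
design = unit * 30
print(len(design), round(molecular_mass(design)))
```

The estimate can then be compared with the preferred ranges stated for fragments and for full length proteins.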

C. Antibodies

The present invention also provides antibodies capable of immunospecifically binding to proteins of the invention. Polyclonal antibodies directed toward a spider silk protein may be prepared according to standard methods. In a preferred embodiment, monoclonal antibodies are prepared, which react immunospecifically with various epitopes of the spider silk proteins described herein. Monoclonal antibodies may be prepared according to general methods of Köhler and Milstein, following standard protocols. Polyclonal or monoclonal antibodies that immunospecifically interact with spider silk proteins can be utilized for identifying and purifying such proteins. For example, antibodies may be utilized for affinity separation of proteins with which they immunospecifically interact. Antibodies may also be used to immunoprecipitate proteins from a sample containing a mixture of proteins and other biological molecules. Other uses of anti-spider silk protein antibodies are described below.

III. Uses of Spider Silk-Encoding Nucleic Acids, Spider Silk Proteins and Antibodies Thereto

A. Spider Silk-Encoding Nucleic Acids

Spider silk protein-encoding nucleic acids may be used for a variety of purposes in accordance with the present invention. Spider silk protein-encoding DNA, RNA, or fragments thereof may be used as probes to detect the presence of and/or expression of genes encoding spider silk proteins. Methods in which spider silk protein-encoding nucleic acids may be utilized as probes for such assays include, but are not limited to: (1) in situ hybridization; (2) Southern hybridization; (3) northern hybridization; and (4) assorted amplification reactions such as polymerase chain reactions (PCR).

The spider silk protein-encoding nucleic acids of the invention may also be utilized as probes to identify related genes from other animal species. As is well known in the art, hybridization stringencies may be adjusted to allow hybridization of nucleic acid probes with complementary sequences of varying degrees of homology. Thus, spider silk protein-encoding nucleic acids may be used to advantage to identify and characterize other genes of varying degrees of relation to the spider silk protein genes of the invention. Such information enables further characterization of nucleic acid sequences which encode proteins that possess physical properties typical of spider silk proteins and thus facilitates structure/function analysis of such proteins. Additionally, they may be used to identify genes encoding proteins that interact with spider silk proteins (e.g., by the “interaction trap” technique), which should further accelerate identification of other components utilized in webs comprised of spider silk proteins. Moreover, interacting proteins identified in such screens may be of utility in the generation and/or optimization of materials comprised of synthetic spider silk proteins. Spider silk protein encoding nucleic acids may also be used to generate primer sets suitable for PCR amplification of target spider silk protein DNA. Criteria for selecting suitable primers are well known to those of ordinary skill in the art.
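Two commonly applied primer criteria are GC content and an approximate melting temperature. The sketch below uses the Wallace rule of thumb for short oligonucleotides, Tm ≈ 2 °C × (A+T) + 4 °C × (G+C); this rule is a general approximation and is not a method recited in this specification, and the primer shown is hypothetical.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G and T")
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer: str) -> int:
    """Approximate melting temperature (deg C) by the Wallace rule, for short oligos."""
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G and T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


primer = "GGTGCTGGTGCAGGTGCT"  # hypothetical 18-nucleotide primer
print(f"GC fraction {gc_fraction(primer):.2f}, approximate Tm {wallace_tm(primer)} C")
```

Because silk-encoding sequences are GC-rich and repetitive, primers would in practice also need to be checked for multiple annealing sites within the repeats.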

Host cells comprising at least one spider silk protein encoding DNA molecule are encompassed in the present invention. Host cells contemplated for use in the present invention include but are not limited to bacterial cells, fungal cells, insect cells, mammalian cells, and plant cells. The spider silk protein encoding DNA molecules may be introduced singly into such host cells or in combination to assess the phenotype of cells conferred by such expression. Methods for introducing DNA molecules are also well known to those of ordinary skill in the art. Such methods are set forth in Ausubel et al. eds., Current Protocols in Molecular Biology, John Wiley & Sons, NY, N.Y. 1995, the disclosure of which is incorporated by reference herein.

As described above, spider silk protein-encoding nucleic acids are also used to advantage to produce large quantities of substantially pure spider silk proteins, or selected portions thereof.

B. Proteins and Antibodies

Purified spider silk protein, or fragments thereof, produced by methods of the present invention can be used to advantage in a variety of different applications, including, but not limited to, production of fabric, sutures, medical coverings, high-tech clothing, rope, reinforced plastics, and other applications in which various combinations of strength and elasticity are required.

TABLE II lists physical properties of various biological and manmade materials

Material            Strength (N m⁻²)    Elasticity (%)    Energy to Break (J kg⁻¹)
Dragline Silk       4 × 10⁹             35                1 × 10⁵
Minor Silk          1 × 10⁹             5                 3 × 10⁴
Flagelliform Silk   1 × 10⁹             200+              1 × 10⁵
KEVLAR              4 × 10⁹             5                 3 × 10⁴
Rubber              1 × 10⁹             600               8 × 10⁴
Tendon              1 × 10⁹             5                 5 × 10³

As shown in Table II, spider silks are characterized by advantageous physical properties, including, but not limited to, high tensile strength and pronounced elasticity, that are highly desirable for numerous applications. It is significant to note that spider silks possess these physical properties in aggregate, which renders them unique proteins having unparalleled utility. For example, spider dragline silk has a tensile strength greater than steel or carbon fibers (200 ksi), elasticity as great as some nylon (35%), a stiffness as low as silk (0.6 msi), and the ability to supercontract in water (up to 60% decrease in length). In view of its high tensile strength and elasticity, the energy required to break dragline silk exceeds that required to break any known fiber including Kevlar™ and steel. These properties are unmatched by any known natural or manmade material. Moreover, the new materials of the present invention would also provide unique combinations of such desirable features in a very low weight material.
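The comparison drawn from Table II can be reproduced with a small data structure. In the sketch below the numerical values are those of Table II; the "200+" elasticity of flagelliform silk is stored as its lower bound of 200, and the conversion factor from ksi to N m⁻² (1 ksi = 6.894757 × 10⁶ N m⁻²) is a standard value that does not appear in the text. The names used are illustrative only.

```python
from typing import List, NamedTuple


class Material(NamedTuple):
    name: str
    strength_n_per_m2: float
    elasticity_pct: float          # "200+" is stored as the lower bound, 200
    energy_to_break_j_per_kg: float


TABLE_II = [
    Material("Dragline silk", 4e9, 35, 1e5),
    Material("Minor silk", 1e9, 5, 3e4),
    Material("Flagelliform silk", 1e9, 200, 1e5),
    Material("KEVLAR", 4e9, 5, 3e4),
    Material("Rubber", 1e9, 600, 8e4),
    Material("Tendon", 1e9, 5, 5e3),
]

PA_PER_KSI = 6.894757e6  # 1 ksi expressed in N m^-2 (pascals)


def ksi_to_pa(ksi: float) -> float:
    """Convert a stress in ksi to N m^-2."""
    return ksi * PA_PER_KSI


def toughest(materials: List[Material]) -> List[str]:
    """Names of the materials with the highest energy to break."""
    best = max(m.energy_to_break_j_per_kg for m in materials)
    return [m.name for m in materials if m.energy_to_break_j_per_kg == best]


print(toughest(TABLE_II))
print(f"200 ksi = {ksi_to_pa(200):.3e} N m^-2")
```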

In view of the foregoing advantageous properties, use of the spider silk proteins disclosed in the present invention as components in materials would produce superior products. When spider silk is dissolved in an appropriate solvent and forced through a small orifice to generate spider silk fibers, such fibers can be woven into a fabric/material or added into a composite fabric/material. For example, spider silk fibers can be woven into fabrics to modulate the strength and elasticity of a fabric, thus rendering materials comprising such modified fabric optimized for different applications. Spider silk fibers can be of particular utility when incorporated into materials used to make high-tech clothing, rope, sails, parachutes, wings on aerial devices (e.g., hang gliders), flexible tie downs for electrical components, sutures, and even as a biomaterial for implantation (e.g., artificial ligaments or aortic banding). Biomedical applications involve use of natural and/or synthetic spider silk fibers of the present invention in sutures used in surgical procedures, including, but not limited to: eye surgery, reconstructive surgery (e.g., nerve or tympanic membrane reconstruction), vascular closure, bowel surgery, cosmetic surgery, and central nervous system surgery. Natural and synthetic spider silk fibers may also be of utility in the generation of antibiotic impregnated sutures and implant material and matrix material for reconstruction of bone and connective tissue. Implants and matrix material for reconstruction may be impregnated with aggregated growth factors, differentiation factors, and/or cell attractants to facilitate incorporation of the exogenous material and optimize recovery of a patient. Spider silk proteins and fibers of the present invention can be used for any application in which various combinations of strength and elasticity are required. Moreover, spider silk proteins can be modified to optimize their utility in any application. As described above, sequences of spider silk proteins can be modified to alter various physical properties of a fibroin and different spider silk proteins and variants thereof can be woven in combination to produce fibers comprised of at least one spider silk protein or variant thereof.

In a preliminary study designed to evaluate the potential for an immune response to a natural spider silk protein, natural dragline silk was implanted into mice and rats intramuscularly, intraperitoneally, or subcutaneously. Animals into which natural dragline silk was introduced did not mount an immune response to the spider silk protein, irrespective of the site of implantation. Of note, tissue sections surrounding spider silk protein implants were essentially identical to tissue sections derived from implantation sites into which a polyethylene rod was inserted. Since a polyethylene rod was used as the solid matrix about which the dragline spider silk protein was wrapped prior to implantation, introduction of a polyethylene rod alone serves as a negative control for the experiment. In view of the above, spider silk proteins of the present invention are expected to elicit minimal immunological responses when introduced into vertebrate animals.

Synthetic spider silk fibers are of utility in any application for which natural spider silk fibers can be used. For example, synthetic fibers may be mixed with various plastics and/or resins to prepare a fiber-reinforced plastic and/or resin product. Because spider silk is stable up to 180° C., spider silk protein fibers would be of utility as structural reinforcement material in thermal injected plastics.

It should be apparent from the foregoing that the spider silk proteins of the present invention and derivatives thereof can be generated in large quantities by means generally known to those of skill in the art. Spider silk proteins and derivatives thereof can be made into fibers for any intended use. Moreover, mixed composites of fibers are also of interest as a consequence of their unique combined properties. Such mixed composites can confer characteristics of flexibility and strength to any material into which they can be incorporated.

Purified spider silk protein, or fragments thereof, may be used to produce polyclonal or monoclonal antibodies which also may serve as sensitive detection reagents for the presence and accumulation of a spider silk protein (or complexes containing spider silk protein) in cells. Recombinant techniques enable expression of fusion proteins containing part or all of a spider silk protein. The full length protein or fragments of the protein may be used to advantage to generate an array of monoclonal antibodies specific for various epitopes of a spider silk protein, thereby providing even greater sensitivity for detection of a spider silk protein in cells.

Polyclonal or monoclonal antibodies immunologically specific for a spider silk protein may be used in a variety of assays designed to detect and quantitate these proteins. Such assays include, but are not limited to: (1) flow cytometric analysis; and (2) immunoblot analysis (e.g., dot blot, Western blot) of extracts from various cells. Additionally, anti-spider silk protein antibodies can be used for purification of a spider silk protein and any associated subunits (e.g., affinity column purification, immunoprecipitation).

From the foregoing discussion, it can be seen that spider silk-encoding nucleic acids, spider silk expressing vectors, spider silk and anti-spider silk antibodies of the invention can be used separately or in combination, for example, 1) to identify nucleic acid sequences encoding other spider silk proteins or proteins comprising similar motifs, 2) to generate novel hybrid spider silk proteins selected for optimization of different physical properties, 3) to express large quantities of spider silk proteins or fragments or derivatives thereof, and 4) to detect expression of spider silk proteins in cells and/or organisms.

The following examples are provided to illustrate an embodiment of the invention. They are not intended to limit the scope of the invention in any way.

EXAMPLE I Identification of Clones Encoding Spider Silk Proteins

In order to identify novel spider silk proteins and expand the limited database of nucleic acid sequences encoding fibroin proteins, eleven cDNA libraries derived from silk glands of seven spider genera were generated. cDNA data were supplemented by information from two genomic libraries and PCR-amplified sequences (7). Partial cDNA or gene sequences for 28 fibroins from seven families of Araneae were identified. The data, as described herein, greatly extend the phylogenetic diversity of characterized fibroins (FIG. 1).

Methods for Collection of Sequence Data

cDNA libraries were made from major ampullate glands of Argiope trifasciata (Araneidae) and Latrodectus geometricus (Theridiidae), flagelliform glands of A. trifasciata, ampullate glands of Dolomedes tenebrosus (Pisauridae), two sets of silk glands from Plectreurys tristis (Plectreuridae), and silk glands of the mygalomorph Euagrus chisoseus (Dipluridae). Glands from Dolomedes were the four pairs of spindly ampullate glands with long tails that are connected to the spinnerets via extensive looped ducts. Glands from Plectreurys were the two largest pairs of ampule-shaped glands. The relatively uniform silk glands of Euagrus were combined in the RNA extraction for this species. Genomic libraries were constructed for Nephila madagascariensis (Tetragnathidae) and for A. trifasciata. Data from the thirteen libraries were augmented by PCR-amplified genomic sequences from eight araneoids.

All the silk glands from Phidippus audax (Salticidae) were combined in the RNA extraction for this species. Similarly, all the silk glands from Zorocrates sp. (Zorocratidae) were combined in the RNA extraction for this species. Separate cDNA libraries were made from the aciniform glands connected to the median spinnerets and aciniform glands connected to the posterior spinnerets of Argiope trifasciata (Araneidae). Identical cDNA sequences for an aciniform fibroin were isolated from both libraries.

Procedures for construction and screening of the eleven cDNA libraries were as follows. Silk glands were dissected from euthanized spiders and flash-frozen in liquid nitrogen. mRNA was extracted from the glands using Dynabeads Oligo(dT)25 (Dynal). cDNA was synthesized using the SuperScript Choice System (Life Technologies) with oligo(dT) as the first-strand synthesis primer. Size-fractionated cDNAs (ChromaSpin-1000, Clontech) were ligated into pGEM-3zf(+) (Promega) and electroporated into SURE cells (Stratagene), pZErO®-2 cells (Invitrogen), or TOP10 cells (Invitrogen). Eleven libraries of ~1500 recombinant colonies each were constructed, and colonies were replicated onto nylon membranes for screening.

Silk cDNA clones were identified by sequential hybridizations (QuikHyb, Stratagene) with the γ³²P-labeled probes CCWAYWCCNCCATATCCWCC (SEQ ID NO: 58), CCWCCWGGWCCNNNWCCWCCWGGWCC (SEQ ID NO: 59), CCWGGWCCTTGTTGWCCWGGWCC (SEQ ID NO: 60), GCDGCDGCDGCDGCDGC (SEQ ID NO: 61), CCWGCWCCWGCWCCWGCWCC (SEQ ID NO: 62), and CCAGADAGACCAGGATTACT (SEQ ID NO: 63) and through the sequencing of clones selected for large insert sizes. The above oligonucleotides were designed based on published spider silk sequences (see 4, 5, 8, 9). Hundreds of positive clones were screened with restriction enzymes, and a subset of clones was sequenced (ABI) using standard M13 sequencing primers and by inserting transposons with the Genome Priming System (NEB). Divergent transcripts were recognized as members of the spider silk fibroin gene family by internal repetitiveness, sequence similarity to previously sequenced araneoid fibroins in the nonrepetitive COOH-terminus, and consistency of sequence translations with amino acid compositions from dissected silk glands and from published studies (J. Palmer (1985) J. Morphol. 186:195).
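By way of illustration only, the degeneracy of the above probes can be tallied with a short Python sketch. The sketch assumes that the letters W, Y, N, and D in the probe sequences follow the standard IUPAC nucleotide ambiguity codes; the function names `degeneracy` and `expand` are hypothetical and form no part of the methods described herein.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (assumed to apply to the probes).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(probe):
    """Number of distinct oligonucleotides represented by a degenerate probe."""
    count = 1
    for base in probe.upper():
        count *= len(IUPAC[base])
    return count

def expand(probe):
    """Yield every concrete sequence represented by a degenerate probe."""
    for combo in product(*(IUPAC[b] for b in probe.upper())):
        yield "".join(combo)

# Three of the probes recited above, keyed by sequence identifier.
probes = {
    "SEQ ID NO: 58": "CCWAYWCCNCCATATCCWCC",
    "SEQ ID NO: 61": "GCDGCDGCDGCDGCDGC",
    "SEQ ID NO: 63": "CCAGADAGACCAGGATTACT",
}

for name, seq in probes.items():
    print(name, "length", len(seq), "degeneracy", degeneracy(seq))
```

Under this assumption, a probe with few ambiguous positions, such as SEQ ID NO: 63, represents only a handful of concrete oligonucleotides, whereas probes with many ambiguous positions represent a correspondingly larger pool.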

Genomic libraries were constructed for the tetragnathid, Nephila madagascariensis (λXGem-12, Promega), and for the araneid Argiope trifasciata (λFixII, Stratagene). The genomic libraries were screened with the radiolabeled probes CCWCCWGGWCCNNNWCCWCCWGGWCC (SEQ ID NO: 59) and CCWGGWCCTTGTTGWCCWGGWCC (SEQ ID NO: 60). Selected silk gene inserts were excised from the λ arms, subcloned into pGEM (Promega) vectors, and sequenced as above.

Genomic sequences were amplified from the araneoids, N. madagascariensis, N. senegalensis, A. trifasciata, A. aurantia, Tetragnatha kauaiensis, T. versicolor, Latrodectus geometricus, and Gasteracantha mammosa. PCR was by standard procedures (Gibco recombinant Taq polymerase) using primers GGTGCTGGACAAGGAGGATACG (SEQ ID NO: 64), GGCTTGATAAACTGATTGACCAACG (SEQ ID NO: 65), and CACAGCCAGAGAGACCAGGATTGC (SEQ ID NO: 66) for MaSp1 and CCAGGAGGATATGGACCAGGTC (SEQ ID NO: 67), CCGACAACTTGGGCGAACTGAG (SEQ ID NO: 68), CAAGGATCTGGACAGCAAGG (SEQ ID NO: 69), CAACAAGGACCAGGAAGTGGC (SEQ ID NO: 70), CCAACCAWTTGCGCATACTG (SEQ ID NO: 71), GCTTGAGTTAAAGAYTGACC (SEQ ID NO: 72), and GCAGGACCAGGAAGTTATG (SEQ ID NO: 73) for MaSp2. A PCR reaction generally resulted in production of a ladder of DNA fragments. Such ladders result from annealing of PCR primers to multiple binding sites in the repetitive sequences of a silk gene. For each fibroin amplified, the largest tight band in the PCR ladder was excised and cloned using the TOPO XL PCR cloning kit (Invitrogen). Two to three clones of each silk gene were sequenced as above. Additional sequences from 11 spider fibroins were taken from GenBank (accession numbers M37137, U03848, M92913, AF027735, AF027736, AF027737, AF027972, AF027973, AF218623, AF218624, U20328, U47853, U47854, U47855, and U47856). These published sequences are from the araneoid genera, Nephila and Araneus (FIG. 1).
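The origin of a PCR ladder can be illustrated with a simplified Python sketch. The model below is a toy under stated assumptions: primer annealing is treated as exact string matching, the downstream primer is assumed to anneal once in a non-repetitive region, and the repeat and tail sequences are invented for illustration (only the forward primer is taken from SEQ ID NO: 64). The function names are hypothetical.

```python
def find_sites(template, primer):
    """Return 0-based start positions of every exact match of primer in template."""
    sites = []
    start = template.find(primer)
    while start != -1:
        sites.append(start)
        start = template.find(primer, start + 1)
    return sites

def ladder_sizes(template, forward, reverse_site):
    """Amplicon lengths when the forward primer anneals at several repeat copies
    and the downstream primer anneals once (reverse_site is given on the top strand)."""
    rev_start = template.find(reverse_site)
    if rev_start == -1:
        return []
    rev_end = rev_start + len(reverse_site)
    return sorted(rev_end - s for s in find_sites(template, forward) if s < rev_start)

forward = "GGTGCTGGACAAGGAGGATACG"   # SEQ ID NO: 64
repeat = forward + "GAGCAGCT"        # invented 30-nt repeat unit containing the primer site
tail = "TTGACCGATCGTACGATCAA"        # invented non-repetitive 3' region
template = repeat * 5 + tail         # hypothetical silk gene fragment

sizes = ladder_sizes(template, forward, tail)
print(sizes)        # one band per repeat copy the forward primer can anneal to
print(max(sizes))   # the largest band, analogous to the band excised for cloning
```

In this toy template each additional repeat copy adds one rung to the ladder, spaced by the repeat length, which is consistent with the explanation given above for the observed ladders.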

Results

Like previously published fibroins from spiders (4-5, 8-9) and lepidopterans (10-11), the sequences of the invention encode repetitive alanine and glycine-rich proteins. In each molecule, iterated amino acid motifs are organized into higher-order ensemble repeats. Ensemble repeats within each fibroin were aligned, and a consensus ensemble repeat was generated for each molecule (12). In part, silk DNA sequences from non-araneoid spiders (FIG. 2) reiterate the importance of amino acid motifs that comprise orb-weaver fibroins. GA, GGX, and A_(n) form the consensus ensemble repeat units of silk fibroins from the pisaurid fishing spider, Dolomedes. The association of these three motifs in Dolomedes silk proteins mirrors the pattern seen in major and minor ampullate fibroins of orb-weavers (FIG. 3). GA, GGX, and A_(n) motifs are also distributed, sometimes sparsely, among ensemble repeat units from successively more basal lineages of spiders (Haplogynae and Mygalomorphae). A_(n) is represented in each of the fibroins from these taxa and from all lineages of Araneae studied thus far (FIGS. 2 and 3). Mygalomorphae, tarantulas and their kin, diverged from Araneomorphae, “true” spiders, minimally 240 million years ago in the middle Triassic (13, FIG. 1), thus A_(n) motifs may have been maintained in different spider silks since that time.
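The consensus rule set out in note 12 (the most common amino acid or gap at each alignment position, with X denoting a tie) can be sketched in Python as follows. This is an illustrative sketch only: the alignment itself is assumed to have been performed beforehand, the example repeats are invented and do not correspond to any SEQ ID NO, and representing gaps with "-" is an assumption.

```python
from collections import Counter

def consensus_repeat(aligned_repeats):
    """Most common residue (or gap '-') per alignment column; ties become 'X'."""
    if len({len(r) for r in aligned_repeats}) != 1:
        raise ValueError("aligned repeats must all have the same length")
    out = []
    for column in zip(*aligned_repeats):
        ranked = Counter(column).most_common()
        top_count = ranked[0][1]
        winners = [res for res, n in ranked if n == top_count]
        out.append(winners[0] if len(winners) == 1 else "X")
    return "".join(out)

# Hypothetical pre-aligned ensemble repeats (invented for illustration).
repeats = [
    "GGAGQGGYGAAAAAA",
    "GGAGQGGLGAAAA--",
    "GGSGQGGYGAAAAAA",
    "GGAGRGGLGAAAAA-",
]
print(consensus_repeat(repeats))
```

In this example the eighth column is split evenly between Y and L and the final column is split evenly between A and a gap, so both positions are reported as X.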

Although the fibroins of Plectreurys (Haplogynae) and Euagrus (Mygalomorphae) are internally repetitive, the ensemble repeats from these basal taxa (FIG. 2) are unlike analogous units from previously described silks (FIG. 3). Each of the fibroins from these primitive groups contains stretches of serine. Plectreurys cDNA1 is highly internally repetitive with iterations of A_(n), S_(n), (GX)_(n), and (AQ)_(n). Plectreurys cDNA3 has a unique molecular architecture with the 5′ end encoding a tandem array of long repeat units, and the 3′ end encoding 15 repeats of a much shorter ensemble unit. The ~346 amino acid Euagrus repeat unit is a complex mixture of serine and alanine-rich sequences that includes a string of threonine, an amino acid that is rare in araneoid fibroins (FIG. 2).

Aside from an overall modular structure, scattered GA, GGX, and A_(n) motifs, and amino acid matches in the non-repetitive carboxy terminus, there is only limited sequence similarity between araneoid fibroins and those from Plectreurys and Euagrus (FIGS. 2 and 3). Data presented herein clearly indicate that spiders utilize a broad diversity of fibroin sequences to spin silk threads; such diversity may be a reflection of the divergent ecosystems inhabited by these species (1, 14). The novel fibroin repeats of basal Araneae suggest that spider silk design may not be especially dependent on specific sequences, but comparisons of fibroins among orb-weavers contradict this notion (FIG. 3).

In combination with published data (4-5, 8-9), these new sequences allow comparisons between the two basal-most clades of ecribellate orb-weavers, Araneidae and “derived araneoids” (FIG. 1), for four groups of fibroins (15): major ampullate spidroin 1-like (MaSp1), major ampullate spidroin 2-like (MaSp2), minor ampullate spidroins (MiSp), and flagelliform silk protein (Flag). Differences among fibroins within each of these four groups are primarily variations in the arrangement and frequency of A_(n), GA, GGX, and GPG(X)_(n) motifs (FIG. 3). A_(n), GA, and GGX are present in consensus repeats for both araneid and derived araneoid MiSp orthologues. Major ampullate fibroins are similarly conserved among araneoids. Stable repeats for MaSp1 are A_(n), GA, and GGX, and for MaSp2 are GPG(X)_(n) and A_(n). These motifs are retained even in major ampullate fibroins of the widow Latrodectus, a cob-web weaving araneoid that does not spin a conventional orb web. The long Flag repeats are divergent within Araneoidea, but both araneid and derived araneoid repeat units are comprised primarily of clustered GPG(X)_(n) and GGX motifs (FIG. 3).
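A rough tally of the four motifs in a translated sequence can be sketched with regular expressions, as shown below. The pattern definitions are assumptions made for illustration (for example, the minimum run lengths and the residues permitted at position X are not specified herein), and the two toy sequences are invented rather than drawn from any SEQ ID NO.

```python
import re

MOTIFS = {
    "A_n": re.compile(r"A{4,}"),        # poly-alanine; minimum run of 4 is an assumption
    "GA": re.compile(r"(?:GA){2,}"),    # alternating Gly/Ala; minimum of 2 couplets is an assumption
    "GGX": re.compile(r"GG[^GP]"),      # GG plus one residue; excluding G and P at X is an assumption
    "GPG(X)_n": re.compile(r"GPG"),     # counts the GPG core only; the (X)_n tail is not modeled
}

def tally_motifs(protein):
    """Count non-overlapping occurrences of each motif pattern in a protein string."""
    return {name: len(rx.findall(protein)) for name, rx in MOTIFS.items()}

# Invented toy repeats, loosely styled after the two major ampullate groups.
masp1_like = "GGAGQGGYGGLGSQGAGRGGLGGQGAGAAAAAAA"
masp2_like = "GPGQQGPGGYGPGQQGPSGPGSAAAAAAAA"

print("MaSp1-like toy:", tally_motifs(masp1_like))
print("MaSp2-like toy:", tally_motifs(masp2_like))
```

Under these assumed patterns the first toy sequence is dominated by GGX with a single poly-alanine run, and the second by GPG cores with a single poly-alanine run, mirroring the stable repeats described above for MaSp1 and MaSp2, respectively.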

Fossil evidence suggests that the divergence of Araneidae from derived araneoids occurred no later than the early Cretaceous (FIG. 1). Therefore, the motifs conserved within MaSp1, MaSp2, MiSp, and Flag have been maintained, presumably by stabilizing selection, for over 125 million years (16). Motifs that have been retained over such long evolutionary periods are likely to be critical to the divergent mechanical properties of the specialized orb-weaver silks.

EXAMPLE II Clones of MaSp1- and MaSp2-Like Spider Silk Proteins

For the purposes of further classification and structure/function analyses of the novel spider silk proteins of the present invention, fibroin sequences were allocated to different ortholog groups of Nephila clavipes. Silk fibroins are long proteins, comprised largely (>90%) of ensemble repeat units which are internally repetitive (4, 5, 8, 9). Ensemble repeat units from different fibroins vary in length, sometimes by an order of magnitude. It is difficult to make residue-to-residue homology statements between molecules because of this length variation and the overall modular structure of silk proteins. The gross similarities and differences between ensemble repeat units were, therefore, initially used to sort araneoid fibroins into four classes. The following proteins correspond to the MaSp1-like type of fibroin previously described in N. clavipes (5, 9).

MaSp1-like group proteins were characterized by short ensemble repeats with single polyalanine stretches. The remainder of a repeat was comprised of numerous GGX motifs and scattered GA motifs. The MaSp1-like group of spider silk proteins includes N. clavipes MaSp1, and proteins encoded by a genomic MaSp1 clone from Argiope aurantia (A. aurantia) (SEQ ID NO: 1), an A. trifasciata MaSp1 cDNA (SEQ ID NO: 2), a Latrodectus geometricus (L. geometricus) MaSp1 cDNA (SEQ ID NO: 3), a genomic MaSp1 clone from N. madagascariensis (SEQ ID NO: 4), a genomic MaSp1 clone from N. senegalensis (SEQ ID NO: 5), a genomic MaSp1 clone from Tetragnatha kauaiensis (T. kauaiensis) (SEQ ID NO: 6), and a genomic MaSp1 clone from T. versicolor (SEQ ID NO: 7). Amino acid sequences encoded by SEQ ID NOs: 1-7 are provided in SEQ ID NOs: 29-35, respectively.

MaSp2-like group proteins were characterized by short ensemble repeats comprised of one polyalanine stretch and various iterations of GPG(X)_(n) and GP motifs. The MaSp2-like group of spider silk proteins includes N. clavipes MaSp2, and proteins encoded by a genomic MaSp2 clone from A. aurantia (SEQ ID NO: 8), an A. trifasciata MaSp2 cDNA (SEQ ID NO: 9), a genomic MaSp2 clone from A. trifasciata (SEQ ID NO: 10), a genomic MaSp2 clone from Gasteracantha mammosa (SEQ ID NO: 11), an L. geometricus MaSp2 cDNA (SEQ ID NO: 12), a genomic MaSp2 clone from L. geometricus (SEQ ID NO: 13), two genomic MaSp2 clones from N. madagascariensis (SEQ ID NOs: 14-15), and a MaSp2 genomic clone from N. senegalensis (SEQ ID NO: 16). Amino acid sequences encoded by SEQ ID NOs: 8-16 are provided in SEQ ID NOs: 36-44, respectively.

EXAMPLE III Clones of Flagelliform-Like Spider Silk Proteins

Based on the gross similarities and differences between ensemble repeat units, the following group of proteins was classified as flag-like type fibroins, similar to those previously described in N. clavipes (5, 9).

Flag-like group proteins were characterized by long ensemble repeats comprised mainly of clustered GGX and GPG(X)_(n) motifs. Each ensemble repeat had a single “spacer” region that contained amino acids atypical of araneoid silks (5). The flag-like group of spider silk proteins includes Flag from N. clavipes and two proteins encoded by A. trifasciata Flag cDNA clones (SEQ ID NOs: 17-18). Amino acid sequences encoded by SEQ ID NOs: 17-18 are provided in SEQ ID NOs: 45-46, respectively.

EXAMPLE IV Clones of Spider Silk Proteins Comprised of Divergent Motifs

Two of the novel spider silk proteins described herein could not be allocated readily into one of the four classes of araneoid fibroins previously described in N. clavipes. This group of fibroins includes spider silk proteins comprised of divergent repetitive motifs, a feature which may reflect the diverse ecosystems of the species from which the nucleic acid sequences encoding these fibroins were derived. This category includes proteins encoded by a Dolomedes tenebrosus (D. tenebrosus) fibroin 1 cDNA (SEQ ID NO: 19) and a D. tenebrosus fibroin 2 cDNA (SEQ ID NO: 20). Amino acid sequences encoded by SEQ ID NOs: 19-20 are provided in SEQ ID NOs: 47-48, respectively.

EXAMPLE V Clones of Spider Silk Proteins Comprised of Atypical Motifs

Seven novel spider silk proteins of the present invention comprise atypical spider silk motifs unlike those described for any previously characterized araneoid fibroin. This group of fibroins includes spider silk proteins comprised of divergent repetitive motifs, a feature which may reflect the diverse ecosystems of the species from which the nucleic acid sequences encoding these fibroins were derived. This category includes proteins encoded by an Euagrus chisoseus (E. chisoseus) fibroin 1 cDNA (SEQ ID NO: 21), a Plectreurys tristis (P. tristis) fibroin 1 cDNA (SEQ ID NO: 22), a P. tristis fibroin 2 cDNA (SEQ ID NO: 23), a P. tristis fibroin 3 cDNA (SEQ ID NO: 24), a P. tristis fibroin 4 cDNA (SEQ ID NO: 25), a Phidippus audax (P. audax) fibroin 1 cDNA (SEQ ID NO: 26), and a Zorocrates sp. fibroin 1 cDNA (SEQ ID NO: 27). Amino acid sequences encoded by SEQ ID NOs: 21-27 are provided in SEQ ID NOs: 49-55, respectively.

EXAMPLE VI An Exemplary Clone of a Spider Silk Protein Comprised of Divergent Motifs

A novel spider silk protein of the present invention comprises atypical spider silk motifs unlike those described for any previously characterized araneoid fibroin. This fibroin is comprised of highly divergent repetitive motifs and is encoded by an A. trifasciata aciniform fibroin 1 cDNA clone (SEQ ID NO: 28). The amino acid sequence encoded by SEQ ID NO: 28 is provided in SEQ ID NO: 56.

A consensus sequence repeat of the A. trifasciata aciniform fibroin 1 protein (SEQ ID NO: 56) comprised of approximately 200 amino acids has been identified herein.

Amino acid sequences comprising the consensus sequence repeat are provided in SEQ ID NO: 57. Such a consensus sequence is of use in a number of applications, including, but not limited to: 1) the generation of degenerate nucleic acid probes capable of encoding SEQ ID NO: 57 which can be used to screen for and identify nucleic acid molecules encoding novel spider silk proteins, 2) the generation of antibodies specific for portions of or all of SEQ ID NO: 57 which can be used to screen for and identify novel spider silk proteins or derivatives thereof, and 3) utilization as a modular unit in the design and production of synthetic spider silk proteins.

EXAMPLE VII Exemplary Methods for Designing Synthetic Spider Silk Proteins and Uses Thereof

The following methods for designing synthetic spider silk proteins are based on the amino acid composition of spider silk proteins and how repetitive regions of amino acid sequences contribute to the structural/physical properties of spider silk proteins.

In general, synthetic spider silk proteins can be comprised of a series of tandem exact repeats of amino acid sequence regions identified as having a spectrum of physical properties. Exact repeats would comprise regions of amino acid sequences that are duplicated precisely. Alternatively, synthetic spider silk proteins can be comprised of a series of tandem inexact repeats identified as having a spectrum of physical properties. Inexact repeats would comprise regions of amino acid sequences in which at least one amino acid sequence can be altered in the basic inexact repeat unit as long as the alteration does not change the spectrum of physical properties characteristic of the basic inexact repeat unit.

In order to increase the tensile strength of minor ampullate silk for applications where strength and very little elasticity are needed, such as bulletproof vests, the (GA)n regions can be replaced by (A)n regions. This change would increase the tensile strength. The typical MiSp1 protein has sixteen (GA) units. Replacing eight (GA) regions, for example, with (A) regions would increase the tensile strength from 100,000 psi to at least 400,000 psi. Moreover, if the (A)n regions were as long as the (GA)n regions the tensile strength would increase to greater than 600,000 psi.
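At the sequence level, the substitution of (A)n regions for (GA)n regions can be sketched in Python as shown below. The sketch is illustrative only: the toy repeat is invented, the definition of a (GA)n region as two or more GA couplets is an assumption, and the function name and its parameters are hypothetical. No mechanical property is computed or predicted by the code.

```python
import re

# A (GA)n region; the minimum of two couplets is an assumption.
GA_RUN = re.compile(r"(?:GA){2,}")

def swap_ga_for_polyala(protein, max_swaps, same_length=True, ala_len=6):
    """Replace up to max_swaps (GA)n regions with (A)n regions.

    same_length=True makes each poly-alanine run as long as the region it replaces;
    otherwise ala_len (a hypothetical default) fixes the run length.
    """
    def repl(match):
        n = len(match.group(0)) if same_length else ala_len
        return "A" * n
    return GA_RUN.sub(repl, protein, count=max_swaps)

# Invented toy sequence with three (GA)n regions separated by GGX motifs.
toy = "GGY" + "GA" * 3 + "GGQ" + "GA" * 4 + "GGY" + "GA" * 2 + "GGL"

print(toy)
print(swap_ga_for_polyala(toy, max_swaps=2))
```

Limiting the number of swaps corresponds to replacing only some of the (GA) regions, as in the example of replacing eight of sixteen, while the same_length option corresponds to the case in which the (A)n regions are as long as the (GA)n regions they replace.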

To create a fiber with high tensile strength and greater elasticity than major ampullate silk, the number of (GPGXX) regions can be increased from 4-5 regions, which is the range of (GPGXX) regions typically found in naturally occurring major ampullate spider silk proteins, to a larger number of regions. For example, if the number were increased to 10-12 (GPGXX) regions, the elasticity would increase to 50-60%. If the number were further increased to 25-30 regions, the elasticity would be near 100%. Such fibers can be used in coverings for wounds (for example, burn wounds) to facilitate easier placement and provide structural support. Such fibers can also be used for clothing and as fibers in composite materials.

The tensile strength of a very elastic flagelliform silk can be increased by replacing some of the (GPGXX) units with (A)n regions. A flagelliform silk protein contains an average of 50 (GPGXX) units per repeat. Replacing two units in each repeat with (A) regions can, therefore, increase the tensile strength of a flagelliform silk by a factor of four to achieve a tensile strength of about 400,000 psi. Uses for such flagelliform silk proteins are similar to those described for major ampullate proteins having augmented elasticity (as described hereinabove). The flagelliform proteins have additional utility in that the spacer regions therein confer the ability to attach functional molecules like antibiotics and/or growth factors (or combinations thereof) to composites comprising flagelliform proteins.

Fibers woven from combinations of the natural and/or synthetic spider silk proteins of the present invention are also encompassed herein. Such composite fibers have utility in a variety of applications, including, but not limited to, production of fabric, sutures, medical coverings, high-tech clothing, rope, and reinforced plastics.

REFERENCES AND NOTES

-   1. J. Coddington, H. Levi, Annu. Rev. Ecol. Syst. 22, 565 (1991).
-   2. F. Vollrath, Intl. J. Biol. Macromol. 24, 81 (1999).
-   3. J. Gosline, P. Guerette, C. Ortlepp, K. Savage, J. Exp. Biol. 202, 3295 (1999).
-   4. P. Guerette, D. Ginzinger, B. Weber, J. Gosline, Science 272, 112 (1996).
-   5. C. Hayashi, R. Lewis, J. Mol. Biol. 275, 773 (1998).
-   6. P. Calvert, Nature 393, 309 (1998).
-   7. J. Gatesy, C. Hayashi, D. Motriuk, J. Woods, R. Lewis, Science 291, 2603 (2001).
-   8. R. Beckwitt, S. Arcidiacono, R. Stote, Insect Biochem. Mol. Biol. 28, 121 (1998).
-   9. GenBank #M37137, M92913, AF027735-AF027737, AF218621-AF218624.
-   10. K. Mita, S. Ichimura, T. James, J. Mol. Evol. 38, 583 (1994).
-   11. GenBank #AF08334, AF095239, AF095240.
-   12. DNA sequences for the various fibroins were translated into amino acid sequences. For each fibroin, ensemble repeat units, higher-order aggregations of iterated sequence motifs, were aligned algorithmically then manually using MacVector (Oxford Molecular), and a consensus ensemble repeat unit was generated. The consensus ensemble repeat was defined as the most common amino acid (or gap) at each position in the alignment of repeats for each molecule. When there was a tie at an alignment position, X denoted this ambiguity in the consensus sequence.
-   13. P. Selden, J. Gall, Palaeontology 35, 211 (1992).
-   14. C. Craig, Annu. Rev. Entomol. 42, 231 (1997).
-   15. Araneoid silk sequences initially were allocated to different fibroin orthologue groups in Nephila clavipes (5, 9) according to sequence characteristics of ensemble repeat units. These assignments were assessed through phylogenetic analyses of non-repetitive carboxy-terminal sequences, expression patterns, fit to independent phylogenetic hypotheses for Araneoidea, and minimization of gene duplication events.
-   16. P. Selden, Palaeontology 33, 257 (1990).
-   17. A. Simmons, E. Ray, L. Jelinski, Macromolecules 27, 5235 (1994).
-   18. S. Sudo et al., Nature 387, 563 (1997).
-   19. X. Qin, K. Coyne, J. Waite, J. Biol. Chem. 272, 32623 (1997).
-   20. A. Tatham, P. Shewry, B. Miflin, FEBS Lett. 177, 205 (1984).
-   21. B. Thiel, K. Guess, C. Viney, Biopolymers 41, 703 (1996).
-   22. Q. Cao, Y. Wang, H. Bayley, Curr. Biol. 7, 677 (1997).
-   23. J. Schultz, Biol. Rev. 62, 89 (1987).
-   24. L. Chen, A. DeVries, C. Cheng, Proc. Natl. Acad. Sci. USA 94, 3817 (1997).
-   25. A. Seidel, O. Liivak, L. Jelinski, Macromolecules 31, 6733 (1998).
-   26. B. Madsen, Z. Shao, F. Vollrath, Intl. J. Biol. Macromol. 24, 301 (1999).
-   27. G. Hormiga, N. Scharff, J. Coddington, Syst. Biol. 49, 435 (2000).

While certain of the preferred embodiments of the present invention have been described and specifically exemplified above, it is not intended that the invention be limited to such embodiments. Various modifications may be made thereto without departing from the scope and spirit of the present invention, as set forth in the following claims.

1. An isolated nucleic acid molecule selected from the group consisting of a nucleic acid sequence encoding a protein of SEQ ID NOS: 36-44.
2. An isolated nucleic acid encoding an MaSp2-like spider silk protein having a plurality of repeating GPG(X)n and An units and a sequence having at least 90% identity to a sequence selected from the group consisting of SEQ ID NOS: 36-44.
3. An isolated nucleic acid encoding SEQ ID NO: 36 as claimed in claim 2.
4. An isolated nucleic acid encoding SEQ ID NO: 37 as claimed in claim 2.
5. An isolated nucleic acid encoding SEQ ID NO: 38 as claimed in claim 2.
6. An isolated nucleic acid encoding SEQ ID NO: 39 as claimed in claim 2.
7. An isolated nucleic acid encoding SEQ ID NO: 40 as claimed in claim 2.
8. An isolated nucleic acid encoding SEQ ID NO: 41 as claimed in claim 2.
9. An isolated nucleic acid encoding SEQ ID NO: 42 as claimed in claim 2.
10. An isolated nucleic acid encoding SEQ ID NO: 43 as claimed in claim 2.
11. An isolated nucleic acid encoding SEQ ID NO: 44 as claimed in claim 2.
12. An expression vector comprising at least one of the nucleic acids of claim 2.
13. A host cell comprising the expression vector of claim 12.
14. An isolated spider silk protein comprising a sequence selected from the group consisting of SEQ ID NOS: 36-44.
15. A silk fiber comprising at least one of the proteins of claim 14.
16. A copolymer fiber comprising at least two of the proteins of claim 14.